Abstract

The  problem  of  simulating  the  hydrodynamics  and  the  acoustic  waves  inside  wind  musical 
instruments  such  as  the  recorder,  the  organ,  and  the  flute  is  considered.  The  problem 
is  attacked  by  developing  suitable  local-interaction  algorithms  and  a  parallel  simulation 
system  on  a  cluster  of  non-dedicated  workstations.  Physical  measurements  of  the  acoustic 
signal  of  various  flue  pipes  show  good  agreement  with  the  simulations.  Previous  attempts 
at  this  problem  have  been  frustrated  because  the  modeling  of  acoustic  waves  requires  small 
integration  time  steps  which  make  the  simulation  very  compute-intensive.  In  addition,  the 
simulation  of  subsonic  viscous  compressible  flow  at  high  Reynolds  numbers  is  susceptible  to 
slow-growing  numerical  instabilities  which  are  triggered  by  high-frequency  acoustic  modes. 
The numerical instabilities are mitigated by employing suitable explicit algorithms: the lattice
Boltzmann method, compressible finite differences, and a fourth-order artificial-viscosity filter.
Further, a technique for accurate initial and boundary conditions for the lattice
Boltzmann method is developed, and the second-order accuracy of the lattice Boltzmann
method is demonstrated.

The compute-intensive requirements are handled by developing a parallel simulation system
on a cluster of non-dedicated workstations. The system achieves 80% parallel efficiency
(speedup/processors) using 20 HP-Apollo workstations. The system is built on UNIX and
TCP/IP  communication  routines,  and  includes  automatic  process  migration  from  busy 
hosts  to  free  hosts. 
Introduction

Figure 1-1: Simulation of a flue pipe that is 20 cm long, 1.34 cm wide, and produces
tones  near  400  and  1100  cycles  per  second.  Air  is  blown  through  the  flue  at  1200  cm/s. 
Iso-vorticity  contours  are  shown  at  25  milliseconds  after  startup. 

1.1  Thesis  outline 

I  have  considered  the  problem  of  simulating  the  hydrodynamics  and  the  acoustic  waves 
inside  wind  musical  instruments  such  as  the  organ  flue  pipe.  I  have  attacked  this 
problem  by  developing  suitable  local-interaction  algorithms  and  a  parallel  simulation 
system  on  a  cluster  of  non-dedicated  workstations.  Previous  attempts  at  this  problem 
have  been  frustrated  for  two  reasons:  First,  the  modeling  of  acoustic  waves  requires 
small integration time steps which make the simulation very compute-intensive. Second,
the simulation of subsonic viscous compressible flow at high Reynolds numbers
is susceptible to slow-growing numerical instabilities which are triggered by
high-frequency acoustic modes.

Below, I outline the main results of my thesis, and I explain how my work fits in
with  previous  work  in  computational  fluid  dynamics  and  in  parallel  computing.  My 
contributions  belong  to  three  categories  as  follows: 

• Physical applications: I demonstrate the first simulations of flue pipes ever
performed that model both hydrodynamics and acoustic waves together.
Physical  measurements  of  the  acoustic  signal  of  various  flue  pipes  show  good 
agreement  with  the  simulations. 

• Numerical methods: I mitigate the problem of numerical instabilities by employing
a fourth-order artificial-viscosity filter. This filter can be used both
with the lattice Boltzmann method and also with a compressible finite difference
method. Further, I develop a technique for accurate boundary conditions
and  initial  conditions  for  the  lattice  Boltzmann  method,  and  I  demonstrate  the 
second-order  accuracy  of  the  lattice  Boltzmann  method. 

• Parallel computing: I handle the problem of compute-intensive requirements by
developing a parallel simulation system on a cluster of non-dedicated workstations.
The system is based on local-interaction methods, small communication
capacity,  and  automatic  migration  of  parallel  processes  from  busy  hosts  to  free 
hosts.  Typical  simulations  achieve  80%  parallel  efficiency  (speedup/processors) 
using  20  HP-Apollo  workstations. 

Later in this chapter, I present a few representative simulations and physical measurements
of the sound generated by a soprano recorder flue pipe. More simulations
and measurements can be found in chapter 7. Between here and chapter 7, the technical
crux of my thesis is presented. Specifically, the equations of fluid mechanics
and fluid acoustics are reviewed in chapter 2. Numerical methods for simulating fluid
flow are analyzed in chapters 3-5. Parallel computing on a cluster of non-dedicated
workstations is discussed in chapter 6.

Regarding  numerical  methods,  I  emphasize  the  lattice  Boltzmann  method  because 
it  is  a  new  approach  for  simulating  fluids  which  is  promising,  and  is  still  undergoing 
refinements and improvements. I develop a technique for accurate initial and boundary
conditions for the lattice Boltzmann method which is very important in practical
situations.¹ Further, I demonstrate experimentally that the discretization error of
the  lattice  Boltzmann  method  decreases  quadratically  with  finer  resolution  both  in 
space  and  in  time.  My  results  on  the  lattice  Boltzmann  method  have  been  published 
in  Skordos  [48],  and  have  helped  to  bring  the  lattice  Boltzmann  method  from  the 
physicists’  world  to  the  engineer’s  world. 

Apart from the lattice Boltzmann method, I examine two different kinds of explicit
finite difference methods. In chapter 4, I compare the lattice Boltzmann method
against an incompressible finite difference method which neglects the acoustic waves
and simulates incompressible flow. In chapters 6 and 7, I compare the lattice Boltzmann
method against a compressible finite difference method which solves the compressible
Navier Stokes equations. The lattice Boltzmann method appears to model
acoustic  waves  slightly  more  accurately  than  the  compressible  finite  difference  method. 
However,  my  comparisons  are  not  complete,  and  further  work  is  needed  to  understand 
better  the  differences  between  the  two  approaches. 

In  general,  I  can  say  that  the  lattice  Boltzmann  approach  has  better  stability 
properties than explicit finite difference methods because the lattice Boltzmann approach
is based on relaxation as opposed to differencing operations. The ability of the
lattice  Boltzmann  method  to  model  acoustic  waves  well,  which  I  mentioned  above, 
is  probably  related  to  the  stability  properties  and  the  smooth  behavior  of  the  lattice 
Boltzmann method for disturbances of small wavelength. A limitation of the lattice
Boltzmann approach is that it cannot handle arbitrary non-uniform grids. This limitation
may be overcome to some extent by joining grids of different resolution (see my
technique for boundary conditions), but this is a subject for future research. Here, I
employ uniform grids only because they are simple to program, to understand, and
to use in parallel computation.

¹ My technique also makes possible multigrids and interpolation between different grids for the
lattice Boltzmann method (see section 4.6.2); however, I have not tested multigrids in actual
simulations yet.

1.2  Unexplored  area  of  fluid  dynamics 

The  simulation  of  fluid  flow  is  very  important  for  engineering  and  science  because 
fluid  phenomena  can  be  found  everywhere,  in  the  sky,  in  the  sea,  inside  engines, 
inside  our  bodies.  Thus,  there  is  great  motivation  for  simulating  fluids.  On  the  other 
hand,  the  simulation  of  fluid  phenomena  is  difficult  because  the  equations  of  motion 
(known  as  the  Navier  Stokes  equations)  are  nonlinear  partial  differential  equations 
that  exhibit  a  wide  range  of  dynamical  behavior  and  have  no  exact  solutions  in  most 
cases.  In  addition,  the  simulation  of  fluid  phenomena  requires  large  amounts  of  data 
to  represent  the  geometry  and  the  dynamics  of  the  flow  accurately.  Consequently, 
computers  are  challenged  to  their  limits  when  simulating  fluid  flow,  and  there  is  a 
never-ending demand for increased computing power to enable finer and more realistic
simulations. 

So far, the field of computational fluid dynamics has succeeded in simulating
flows of many different types: supersonic, transonic, flow through porous media,
mixtures of fluids, free surface flows. In addition, progress has been made towards
faithful simulation of turbulent flows and flows with chemical reactions. Yet, these
achievements are only the beginning of a long exploration. As computer technology
improves and new algorithms are discovered, more fluid phenomena will succumb to
simulation. For instance, fluid phenomena that include two different time-scales,
slow-moving hydrodynamics and fast-moving acoustic waves, are now possible to simulate
numerically using parallel computers, as I demonstrate in my thesis. This is an area
of computational fluid dynamics that has remained unexplored until now.

The  generation  of  sound  inside  wind  musical  instruments  such  as  the  organ,  the 
recorder,  and  the  flute  is  a  phenomenon  which  depends  on  the  interaction  between 
hydrodynamics and acoustic waves. Specifically, when a jet of air impinges a solid
obstacle in the vicinity of a cavity, the jet begins to oscillate strongly and produces
acoustic waves. The acoustic waves reflect off the cavity, and return to interact with
the jet according to a complex nonlinear feedback cycle. Similar phenomena that depend
on the interaction between acoustic waves and jets occur in human whistling and
in  voicing  of  fricative  consonants  (Shadle85  [46]).  The  computer  simulation  of  these 
phenomena  provides  a  precise  way  of  studying  the  phenomena  and  experimenting 
with  different  parameters. 

The  main  difficulties  that  have  prevented  simulations  of  subsonic  flow  inside  flue 
pipes  arise  from  the  fact  that  the  subsonic  flow  involves  two  different  time-scales, 
hydrodynamics  and  acoustic  waves,  which  interact  with  each  other  nonlinearly.  On 
the  one  hand,  the  simulation  is  compute-intensive  because  the  integration  time  step 
must  be  very  small  to  follow  the  acoustic  waves  (section  3.2.1).  On  the  other  hand,  the 
simulation  of  compressible  flow  is  susceptible  to  slow-growing  numerical  instabilities 
when  the  Reynolds  number  is  large.  I  handle  the  compute-intensive  requirements 
by  developing  a  parallel  simulation  system  on  a  cluster  of  workstations.  In  addition, 
I mitigate the numerical instabilities by employing a fourth-order artificial-viscosity
filter (chapter 5) in combination with the lattice Boltzmann method and also in
combination with a compressible finite difference method.
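
To illustrate how a fourth-order filter of this kind acts, the following minimal sketch applies a
generic fourth-difference smoothing operator on a periodic one-dimensional grid. It is a stand-in
under stated assumptions, not necessarily the filter of chapter 5: the function name
fourth_order_filter, the coefficient eps, and the periodic boundary are all chosen here for
illustration only.

```python
def fourth_order_filter(u, eps):
    """Apply one pass of a generic fourth-difference filter on a periodic 1D grid.

    Assumed form (illustrative, not taken from the thesis):
        u_i <- u_i - eps * (u[i-2] - 4 u[i-1] + 6 u[i] - 4 u[i+1] + u[i+2])
    """
    n = len(u)
    filtered = []
    for i in range(n):
        # Fourth difference built from the five-point stencil around node i.
        d4 = (u[(i - 2) % n] - 4.0 * u[(i - 1) % n] + 6.0 * u[i]
              - 4.0 * u[(i + 1) % n] + u[(i + 2) % n])
        filtered.append(u[i] - eps * d4)
    return filtered


if __name__ == "__main__":
    constant = [1.0] * 8                         # smoothest possible field
    sawtooth = [(-1.0) ** i for i in range(8)]   # shortest wavelength on the grid
    print(fourth_order_filter(constant, 1.0 / 16.0))
    print(fourth_order_filter(sawtooth, 1.0 / 16.0))
```

In this sketch a constant field passes through unchanged, while the sawtooth with a wavelength of
two grid spacings is removed in a single pass when eps = 1/16. Long-wavelength content is damped
only weakly, which is why a filter of this type can suppress high-frequency modes without adding
much dissipation at the scales of interest.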

The traditional approach of simulating subsonic flow is to approximate the subsonic
flow with a perfectly incompressible flow, as defined in section 2.4.3. The
incompressible flow approximation ignores the propagation of acoustic waves (it assumes
infinitely fast propagation), and allows the use of large integration time steps
(Peyret&Taylor [38]). Such an approach is valid when the acoustic waves play a
secondary role from a physical point of view: for example, when the time-scale of
acoustic waves does not influence the main flow, and when we are not interested in
the  generation  of  acoustic  waves.  The  incompressible  flow  approach  is  also  valid  when 
we  are  interested  in  the  generation  of  acoustic  waves,  but  the  acoustic  waves  do  not 
interact  with  the  hydrodynamics.  In  such  a  case,  the  incompressible  flow  solution 
can  be  computed  separately  and  then  used  as  a  source  term  to  the  wave  equation 
(Harding  [24]).  Moreover,  the  wave  equation  can  be  linearized,  and  can  be  solved 
using  analytic  approximations  (Green  function  integrals,  for  example)  avoiding  the 
cost  of  a  direct  numerical  solution. 

The  incompressible  flow  approximation  is  a  good  idea  when  the  propagation  of 
acoustic  waves  does  not  influence  the  dynamics  of  the  phenomenon.  However,  it  is 
inappropriate when the flow problem depends on the interaction between hydrodynamics
and acoustic waves (the flow of air inside flue pipes, for example). The only
way  to  simulate  correctly  such  a  flow  is  to  simulate  both  the  hydrodynamics  and  the 
acoustic  waves  together.  In  other  words,  the  only  way  to  simulate  such  a  problem  is 
to  solve  numerically  the  compressible  Navier  Stokes  equations,  and  to  compute  the 
time-dependent  evolution  of  the  flow  and  the  acoustic  waves.  This  is  the  subject  of 
my  thesis. 

1.3  Local-interaction  parallel  computing 

Parallel computing is necessary in order to perform high resolution simulations of
hydrodynamics and acoustic waves. To this end, I have developed a parallel system on a
cluster of 25 non-dedicated workstations. The system achieves concurrency by decomposing
the simulated area into subregions and by assigning the subregions to parallel
subprocesses  on  different  workstations.  The  use  of  explicit  numerical  methods  leads  to 
small  communication  requirements.  The  parallel  subprocesses  automatically  migrate 
from  busy  hosts  to  free  hosts  in  order  to  exploit  the  unused  cycles  of  non-dedicated 
workstations, and to avoid disturbing the regular users. The system achieves 80%
parallel efficiency (speedup/processors) using 20 HP-Apollo workstations in a cluster
where there are 25 non-dedicated workstations total.²
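
The decomposition idea can be sketched in a few lines. The example below is a minimal illustration
under stated assumptions: it uses a one-dimensional explicit diffusion update as a stand-in for the
flow solver, and the names explicit_step, step_whole, and step_decomposed are hypothetical. It
shows that when each subregion receives only one boundary value from each neighbour, the
decomposed update reproduces the single-domain update exactly.

```python
def explicit_step(u, left, right, alpha=0.25):
    """One explicit update; each node uses only its two nearest neighbours."""
    padded = [left] + u + [right]
    return [padded[i] + alpha * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1])
            for i in range(1, len(padded) - 1)]


def step_whole(u):
    """Update the whole domain at once; the walls are held at the value 0."""
    return explicit_step(u, 0.0, 0.0)


def step_decomposed(u, parts):
    """Split u into contiguous subregions, exchange one boundary value with
    each neighbouring subregion, update every subregion independently, and
    reassemble the result. Requires len(u) >= parts."""
    n = len(u)
    bounds = [(k * n) // parts for k in range(parts + 1)]
    chunks = [u[bounds[k]:bounds[k + 1]] for k in range(parts)]
    result = []
    for k, chunk in enumerate(chunks):
        # The only data needed from outside the subregion: one value per side.
        left = chunks[k - 1][-1] if k > 0 else 0.0
        right = chunks[k + 1][0] if k < parts - 1 else 0.0
        result.extend(explicit_step(chunk, left, right))
    return result


if __name__ == "__main__":
    field = [float((i * i) % 7) for i in range(20)]
    print(step_decomposed(field, 4) == step_whole(field))
```

In a real distributed run each subregion would live in its own process and the boundary values
would travel over the network; the point of the sketch is only that the amount of exchanged data
grows with the boundary of a subregion, not with its interior.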

In  chapter  6,  I  describe  the  implementation  of  the  parallel  simulation  system, 
and  I  present  detailed  measurements  of  the  parallel  efficiency  (speedup/processors) 
of  2D  and  3D  simulations  of  fluid  dynamics.  Further,  I  develop  a  theoretical  model 
of efficiency which fits the measurements closely. The measurements show that the
shared-bus  Ethernet  network  is  adequate  for  two-dimensional  simulations  of  fluid 
dynamics,  but  limited  for  three-dimensional  ones.  I  expect  that  new  technologies 
in  the  near  future  such  as  Ethernet  switches,  FDDI  and  ATM  networks  will  make 
practical  three-dimensional  simulations  of  fluid  dynamics  on  a  cluster  of  workstations. 
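
The efficiency measure used throughout (speedup/processors) can be written out as a short sketch.
The function names and the run times below are hypothetical and chosen only so that the numbers
match the 80% figure quoted for 20 workstations; they are not measurements from chapter 6.

```python
def speedup(t_serial, t_parallel):
    """Ratio of the single-processor run time to the parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency defined as speedup divided by the number of processors."""
    return speedup(t_serial, t_parallel) / processors


if __name__ == "__main__":
    # Hypothetical timings: 1600 s on one workstation, 100 s on 20 workstations.
    print(speedup(1600.0, 100.0))
    print(parallel_efficiency(1600.0, 100.0, 20))
```

Read the other way around, an efficiency of 80% on 20 workstations corresponds to a speedup of
about 16 over a single workstation.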

It  is  worth  emphasizing  that  the  success  of  my  parallel  simulation  system  depends 
considerably  on  the  use  of  explicit  methods.  This  is  because  explicit  methods  are 
completely  parallelizable,  and  lead  to  small  communication  requirements  which  can 
be satisfied on a cluster of workstations. The disadvantage of explicit methods is
that small integration time steps are required for numerical stability. However, the
simulation of subsonic flow requires small integration time steps anyway, to model
the  fast-moving  acoustic  waves.  Thus,  there  is  a  match  between  the  requirements  of 
the  problem  and  the  requirements  of  explicit  methods.  In  addition,  there  is  a  match 
between  the  problem,  the  algorithms,  and  the  computer  system. 
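
A rough estimate shows how small the time step has to be. The sketch below assumes that the
numerical speed Δx/Δt is simply set equal to the speed of sound (the text only requires it to be
of that order), and it uses the grid spacing of 0.01 cm and the speed of sound of 34400 cm/s
quoted in section 1.4.2. The function names are hypothetical.

```python
import math

SPEED_OF_SOUND_CM_S = 34400.0  # value quoted in section 1.4.2


def acoustic_time_step(dx_cm, c_cm_s=SPEED_OF_SOUND_CM_S):
    """Time step for which the numerical speed dx/dt equals the speed of sound.

    This equality is an illustrative assumption; the actual solver only needs
    dx/dt to be of the order of the speed of sound.
    """
    return dx_cm / c_cm_s


def steps_needed(duration_s, dt_s):
    """Number of integration steps required to cover a given physical time."""
    return math.ceil(duration_s / dt_s)


if __name__ == "__main__":
    dt = acoustic_time_step(0.01)
    print(dt)                       # a few times 1e-7 seconds
    print(steps_needed(0.025, dt))  # steps needed to reach 25 ms of physical time
```

Under these assumptions, reaching the 25 milliseconds shown in figure 1-1 takes on the order of
86,000 integration steps over the whole grid, which gives a sense of why the simulation is
compute-intensive.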

In general, explicit methods are desirable for parallel computing when increasing
numbers of local processing units are available with minimum communication capacity
between the processing units. Such computers may be widespread in the future; for
instance, a future parallel computer may consist of millions of local processing units,
each unit having the power of one of today’s workstations. Communication is going
to dominate the cost of such computers, and methods that minimize communication
are going to be desirable. With this perspective in mind, the work presented herein
for a cluster of 25 workstations may have applications to future parallel computers
as well.

² A major motivation for developing parallel computing on a cluster of workstations has been the
high availability of workstations compared to other parallel computers. At the Artificial Intelligence
Laboratory and the Laboratory for Computer Science at MIT where I have done most of this work,
there is a Connection Machine CM-5 with 128 processors, but the machine is time-shared by too
many people. There are typically 10 users sharing the 128 processors on the average, which reduces
the computation power to 12 processors per user at best. This processing power is not enough for
my purposes.

The computational speed of an HP9000/715 workstation is approximately 3-4 times the computational
speed of one processor of the CM-5. Thus, a distributed simulation using 20 HP9000/715
workstations is approximately equivalent to 60-80 processors of the CM-5 running in dedicated mode.
Of course, this comparison only applies to special problems that have a small ratio of communication
to computation. Other problems that have large communication requirements would not run
efficiently on my distributed system. Such problems might run efficiently on a parallel computer
such as the CM-5 that has a powerful communication network.

1.3.1  Comparison  with  other  work  in  parallel  computing 

The  suitability  of  local-interaction  algorithms  for  parallel  computing  on  a  cluster  of 
workstations  has  been  demonstrated  in  previous  works,  such  as  [7],  [9],  and  elsewhere. 
Cap&Strumpen  [7]  present  the  PARFORM  system  and  simulate  the  unsteady  heat 
equation using explicit finite differences. Chase et al. [9] present the AMBER system,
and solve Laplace’s equation using Successive Over-Relaxation. The present
work emphasizes, and clarifies further, the importance of local-interaction methods
for  parallel  systems  with  small  communication  capacity.  Furthermore,  a  real  problem 
of  science  and  engineering  is  solved  using  the  present  approach.  The  problem  is  the 
simulation  of  subsonic  flow  with  acoustic  waves  inside  wind  musical  instruments. 

In the fluid dynamics community, little attention has been given so far to simulations
of hydrodynamics and acoustic waves. The reason is that such simulations are
very  compute-intensive,  and  can  be  performed  only  when  parallel  systems  such  as  the 
one  described  herein  are  available.  Furthermore,  the  fluid  dynamics  community  has 
generally  shunned  the  use  of  explicit  methods  because  explicit  methods  require  small 
integration  time  steps  (see  section  3.2).  With  the  increasing  availability  of  parallel 
systems, explicit methods are now attracting more attention in all areas of computational
fluid dynamics. The present work clearly reveals the power of explicit methods
in one particular area, and should motivate further work in explicit methods and
local-interaction algorithms.

Regarding  parallel  efficiency  (speedup/processors),  the  efficiency  of  my  parallel 
simulation  system  is  very  good,  80%  typically.  My  measurements  of  the  efficiency 
(section  6.7)  are  more  detailed  than  any  other  reference  that  I  know,  especially  for 
the  case  of  a  shared-bus  Ethernet  network.  I  also  develop  a  model  of  parallel  efficiency 
in  section  6.8,  which  is  based  on  simple  ideas  that  have  been  discussed  previously, 
for example in Fox et al. [19] and elsewhere. I compare the predictions of this model
against  real  measurements  of  the  parallel  efficiency. 

Regarding the problem of using non-dedicated workstations, I handle this problem
by employing automatic process migration from busy hosts to free hosts. An
alternative approach which has been used elsewhere is the dynamic allocation of processor
workload. In the present context, dynamic allocation means to enlarge and to
shrink  the  subregions  which  are  assigned  to  each  workstation  depending  on  the  CPU 
load  of  the  workstation  (Cap&Strumpen  [7]).  Although  this  approach  is  important 
in  various  applications  (Blumofe&Park  [5]),  it  seems  unnecessary  for  simulating  fluid 
flow  problems  with  static  geometry.  For  such  problems,  it  may  be  simpler  and  more 
effective to use fixed-size subregions per processor, and to apply automatic migration
of  processes  from  busy  hosts  to  free  hosts.  This  approach  has  worked  very  well  in  the 
parallel  simulations  presented  here. 

Regarding the design of the parallel simulation system, I have aimed for simplicity.
In particular, the special constraints of local-interaction problems and static
decomposition have guided the design of the parallel system. The automatic migration
of processes has been implemented in a straightforward manner because the
system is very simple. The availability of a homogeneous cluster of workstations
and a common file system has also simplified the implementation, which is based
on  UNIX  and  TCP/IP  communication  routines.  The  approach  presented  here  works 
well  for  spatially-organized  computations  which  employ  a  static  decomposition  and 
local-interaction algorithms.

My thesis does not examine issues such as high-level parallel programming, parallel
languages, and inhomogeneous clusters of workstations. Efforts along these directions
are the PVM system (Sunderam [50]), the Linda system (Carriero [8]), the packages of
(Kohn&Baden [30]) and (Chesshire&Naik [If]) that facilitate parallel decomposition,
the Orca language for distributed computing (Bal et al. [1]), etc.

1.4  Some  simulation  results 

This  section  describes  a  few  representative  simulations  and  physical  measurements  of 
the  musical  tones  generated  by  a  soprano  recorder  flue  pipe. 

1.4.1  Flue  pipe  of  a  soprano  recorder 

The  recorder  is  a  ZEN-ON  SB-DX  soprano  recorder,  made  in  Japan,  and  commonly 
available  in  music  stores.  The  recorder  consists  of  three  parts  which  are  made  out  of 
plastic, and which connect together to make the recorder (see figure 1-2).

•  The  head  of  the  recorder  consists  of  the  flue  (narrow  passage  where  the  jet  of 
air  is  formed),  the  labium  (sharp  edge  which  the  jet  impinges),  and  a  short 
cylindrical  pipe  of  length  6.1  cm  and  diameter  1.34  cm. 

•  The  main  pipe  of  the  recorder  is  designed  to  attach  to  the  head  of  the  recorder. 
The main pipe is cylindrical, it tapers along its length, and includes finger-holes
for  playing  different  tones. 

•  The  end-piece  of  the  recorder  is  designed  to  attach  to  the  end  of  the  main 
pipe. The end-piece has a flaring shape, and includes one double-finger-hole for
playing the lowest notes C and C♯ of the recorder.

Figure 1-2: A three-piece soprano recorder.

For the purpose of testing the basic phenomenon of tone generation by the recorder,
the finger-holes and the tapering shape of the recorder are not necessary, and they
are omitted here. Specifically, the main pipe of the recorder is replaced with a new
pipe which has constant diameter and no finger-holes. The new pipe is connected
to  the  head  of  the  recorder  which  is  6.1  cm  long.  The  addition  of  the  new  pipe 
results  in  lengths  such  as  20  cm  which  are  typical  of  soprano  recorders.  It  should  be 
noted that the attached pipe has a slightly smaller diameter (1.27 cm) than the head
of the recorder (1.34 cm). This difference is very small, however, and is neglected in
the  computer  simulations.  The  attached  pipe  is  closed  at  the  far  end  in  the  present 
experiments (see chapter 7 for simulations of open-end pipes).

Figure 1-3: Soprano recorder flue, 20 cm pipe. The numbers shown correspond to
millimeters.

Figure 1-4: A smaller outlet region than figure 1-3.

Figures 1-3 and 1-4 show the recorder according to a simplified 2D geometry which
is used in the simulations. The gray areas correspond to the walls around the recorder.
The walls above the recorder are skipped in the simulation in order to reduce the
computational effort. The pipe is located at the bottom of the picture, and measures
20 cm long and 1.34 cm wide. The flue (or flue channel) is located at the bottom left
corner, and measures 4 cm long and 0.1 cm wide. At a distance of 0.4 cm in front
of the orifice of the flue (where the jet of air emerges), there is a sharp edge which is
called the labium. The labium forms an angle of approximately 14 degrees, and
is positioned slightly below the midline of the flue channel. Specifically, the tip of
the labium is located at 1.34 cm from the bottom of the pipe, and the flue channel is
located between 1.3 cm and 1.4 cm.


Figure  1-5:  The  flue  and  the  labium  in  three  dimensions.  Not  drawn  to  scale.  The 
numbers correspond to centimeters.

In three dimensions, the pipe of the recorder is a cylinder, and the flue channel and
the labium are approximately rectangular as shown in figure 1-5. The flue channel is
slightly curved along the sides which measure 0.99 cm and 0.93 cm, but the curvature
is very small and is neglected here. Further, the flue channel tapers slightly along
the side which measures 4.0 cm. Specifically, the flue channel measures 0.13 cm by
0.99 cm at the inlet (where air is blown into the recorder), and it measures 0.10 cm
by 0.93 cm at the orifice (where the air emerges to strike the labium). The tapering
of  the  flue  channel  is  neglected  in  the  computer  simulations  because  it  is  very  small. 

The  Reynolds  number  of  the  flow  of  air  inside  the  soprano  recorder  ranges  between 
500 and 1700. The Reynolds number is defined as the mean speed of the jet of air
inside the flue channel (typical speeds are between 800 and 2500 cm/s) times the width
of the flue channel, 0.1 cm, and divided by the kinematic viscosity of air, 0.15 cm²/s
(see  section  2.6  for  details).  High  Reynolds  numbers  typically  produce  turbulent  flow 
which  involves  very  small  length  scales,  and  is  difficult  to  simulate  numerically.  In 
the  case  of  a  narrow  jet  of  air  0.1  cm  wide,  a  Reynolds  number  above  500  is  rather 
high, so that the jet is very unstable and becomes turbulent after exiting the orifice
and impinging the labium. Although the computer simulations cannot model the
fine scales of turbulence (the grid size is only Δx = 0.01 cm), an artificial-viscosity
filter is used which dissipates small wavelengths in a pseudo-turbulent fashion
(see section 5.5). It appears that a precise model of turbulence is not necessary to
reproduce the basic operation of the flue pipe. Further investigation of this issue
should  be  done  in  the  future. 
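
The quoted range of Reynolds numbers follows directly from the definition above. The sketch
below is only a worked check of that arithmetic; the function and constant names are
hypothetical, and the numerical values are the ones stated in the text.

```python
FLUE_WIDTH_CM = 0.1              # width of the flue channel, from the text
KINEMATIC_VISCOSITY_AIR = 0.15   # cm^2/s, from the text


def reynolds_number(jet_speed_cm_s, width_cm=FLUE_WIDTH_CM,
                    viscosity_cm2_s=KINEMATIC_VISCOSITY_AIR):
    """Reynolds number: mean jet speed times channel width over kinematic viscosity."""
    return jet_speed_cm_s * width_cm / viscosity_cm2_s


if __name__ == "__main__":
    for speed in (800.0, 1200.0, 2500.0):
        print(speed, round(reynolds_number(speed)))
```

Jet speeds of 800 and 2500 cm/s give Reynolds numbers of about 533 and 1667, consistent with the
stated range of 500 to 1700; the 1200 cm/s jet of figure 1-1 corresponds to a Reynolds number of
about 800.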

1.4.2  Computer  simulations 

Simulation results using the lattice Boltzmann method and the compressible finite
difference method of section 3.3 are presented here. The simulations are based on the
geometry shown in figure 1-3 for the lattice Boltzmann method, and on the geometry
shown in figure 1-4 for the finite difference method. The two geometries are almost
identical except that the outlet region is 8.0 cm wide in the former, and is 5.8 cm
wide in the latter. This difference is purely accidental (availability of
workstations), and is too small to affect the results significantly. However, it should
be noted that very small outlet regions become quickly saturated with the vorticity
generated by the flue, and complicate the simulation. Thus, the size of the outlet
region should be as large as possible within one’s computational constraints.

Figure 1-6: Simulation of a 20 cm flue pipe. The decomposition 10 × 6 is shown as
dashed lines. 22 workstations are used. The gray-shaded areas are not simulated.

In  the  simulations,  the  air  is  forced  through  the  inlet  (the  entrance  of  the  flue 
channel),  and  exits  through  the  outlet  (the  top  part  of  the  picture).  During  the 
initial  blowing  of  the  air  into  the  flue  channel,  the  imposed  density  and  velocity  at 
the inlet rise smoothly to final values within 3 ms (see section 7.3.2 for more details).
Appropriate  boundary  conditions  at  the  inlet  and  the  outlet  (see  section  7.3)  maintain 
the  air  flow  through  the  recorder,  and  prevent  reflection  of  acoustic  waves  at  the  inlet 
and  the  outlet.  All  other  boundaries  are  solid  walls  and  reflect  the  acoustic  waves 
which  are  generated  by  the  flue. 

The spatial resolution of all the simulations presented in this section is Δx =
0.01 cm. This resolution corresponds to 10 fluid nodes along the width 0.1 cm of the
flue channel (see figures 1-7 and 1-8), and produces adequate results. Finer-resolution
simulations of flue pipes have also been performed (for example, 13 nodes along the
width of the flue channel), and the results do not change very much. Fewer than 10
nodes along the width of the flue channel are not recommended because the ratio of
the width of the flue channel divided by the width of the tip of the labium (one Δx
wide) should be at least 10 : 1 in order to produce a “sharp” labium and in order to
position the labium along the width of the flue channel with an accuracy of 0.01 cm.

Figure 1-7: The grid at the flue-labium region. There are 10 fluid nodes along the
width 0.1 cm of the flue channel.

Figure 1-8: Magnified view of the orifice and the labium according to the simulations.

The integration time step is determined from the requirement that the numerical
speed $\Delta x / \Delta t$ must be of the order of the speed of sound, 34400 cm/s.
Accordingly, the time step is kept very small, for example
$\Delta t = 2.1 \times 10^{-7}$ s, which makes the

10.2

Table 1.1: Frequencies, lattice Boltzmann, 20 cm closed-end recorder

u_mean   f_0    (λ_0)   A_0      f_1    (λ_1)   A_1      f_2    (λ_2)   A_2
cm/s     Hz     (cm)    10^-?    Hz     (cm)    10^-?    Hz     (cm)    10^-?
 838      424   (81)    0.12      326   (106)   1.01     1134   (30)    0.52
1113     1116   (31)    1.39      420   (82)    3.69      244   (141)   1.98
1634     1882   (18)    1.89     1182   (29)    7.85      329   (104)   6.58
2082     1957   (18)    4.26      377   (91)    25.1     1143   (30)    10.1

Table 1.2: Frequencies, compressible finite difference, 20 cm closed-end recorder


u_mean   f_0    (λ_0)   A_0      f_1    (λ_1)   A_1      f_2    (λ_2)   A_2
cm/s     Hz     (cm)    10^-1    Hz     (cm)    10^-2    Hz     (cm)    10^-?
 734      395   (87)    1.051    1186   (29)    3.177    2768   (12)    10.55
1140     1111   (31)    1.095     401   (86)    8.754    1915   (18)    14.63
1558     1140   (30)    2.016    1879   (18)    0.996     398   (87)    7.557
1985     1145   (30)    2.676    3438   (10)    0.959    5730   (6)     6.169
2420     1918   (18)    2.947    3836   (9)     3.015    7670   (4.5)   0.889

Table 1.3: Frequencies, physical measurements, 20 cm closed-end recorder


20 cm pipe     f_0    (λ_0)   f_1    (λ_1)    f_2    (λ_2)    f_3    (λ_3)    f_4    (λ_4)
               Hz     (cm)    Hz     (cm)     Hz     (cm)     Hz     (cm)     Hz     (cm)
open-closed    430    (80)    1290   (26.7)   2150   (16)     3010   (11.4)   3870   (8.9)
open-open      860    (40)    1720   (20)     2580   (13.3)   3440   (10)     4300   (8)

Table 1.4: Ideal resonant frequencies, 20 cm, open-closed and open-open.

simulation very compute-intensive, and makes parallel computing a necessity. Typical
simulations correspond to 30 ms, and require 150000 integration steps. Figure 1-6
shows a typical decomposition of the geometry of a flue pipe into subregions for the
purpose of parallel computing. The decomposition 10 × 5 is shown as dashed lines.
The gray-shaded areas are not simulated; only the white areas are simulated. There
are 22 rectangular subregions which are active, and are assigned to 22 workstations.
Each workstation can update 39100 fluid nodes per second (when the lattice Boltzmann
method is used, see chapter 6), and the parallel efficiency is approximately
80%. It takes about 48 hours of running time to perform 150000 integration steps
using 0.79 million fluid nodes.
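
The cost figures quoted in this section can be cross-checked with a few lines of
arithmetic. The Python sketch below uses only numbers stated in the text. The variable
names are illustrative, the exponent of the time step is read as $10^{-7}$ (which is
consistent with 150000 steps covering roughly 30 ms), and the run-time formula (total
node updates divided by the aggregate update rate, scaled by the parallel efficiency)
is an assumed simple cost model, not a description of how the timing was measured.

```python
# Back-of-the-envelope check of the simulation cost, using figures quoted in the text.
# The cost model (node updates / effective aggregate rate) is an assumption.

SOUND_SPEED = 34400.0    # cm/s, speed of sound
DX = 0.01                # cm, grid spacing
DT = 2.1e-7              # s, example time step (exponent read as -7)
STEPS = 150_000          # integration steps per run
NODES = 0.79e6           # fluid nodes
WORKSTATIONS = 22        # active subregions, one workstation each
NODE_RATE = 39_100.0     # node updates per second per workstation (lattice Boltzmann)
EFFICIENCY = 0.80        # parallel efficiency

numerical_speed = DX / DT                           # cm/s
simulated_time_ms = STEPS * DT * 1e3                # ms of physical time covered
effective_rate = WORKSTATIONS * NODE_RATE * EFFICIENCY
run_time_hours = NODES * STEPS / effective_rate / 3600.0

print(f"numerical speed / sound speed = {numerical_speed / SOUND_SPEED:.2f}")
print(f"simulated time = {simulated_time_ms:.1f} ms")
print(f"estimated run time = {run_time_hours:.1f} hours")
```

Under these assumptions the estimate comes out close to the 48 hours quoted above,
and the numerical speed is of the same order as the speed of sound, as required.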

Figures  1-15  to  1-18  show  acoustic  signals  obtained  from  simulations  of  the  20  cm 
closed-end  recorder  using  the  lattice  Boltzmann  method.  Corresponding  results  using 
the compressible finite difference method are shown in figures 1-19 to 1-22. The
major  frequencies  of  the  acoustic  signals  are  summarized  in  tables  1.1  and  1.2.  For 
comparison  purposes,  frequencies  obtained  from  physical  measurements  are  shown  in 
table  1.3  (they  are  discussed  in  the  next  section),  and  the  ideal  resonant  frequencies  of 
a  passive  pipe  20  cm  long  are  shown  in  table  1.4  (again  explained  in  the  next  section). 

A sampling interval of approximately $3.09 \times 10^{-5}$ s is used in the computer
simulations, which corresponds to a maximum frequency of 16.2 kHz. Frequencies of
interest are less than 5 kHz, and are shown in the figures; frequencies higher than
5 kHz are not shown because they are of very small amplitude. Each figure plots
the acoustic signal in the time domain at the bottom, and in the frequency domain
at the top. In the time domain, the acoustic signal is shown as the relative density
(a non-dimensional number). In the frequency domain, the acoustic signal is shown
as the pressure normalized by a standard pressure level of $2 \times 10^{-4}$ g/(cm s$^2$)
(see section 2.6). Also, in the frequency domain the acoustic signal is plotted according
to a logarithmic scale of $20 \log_{10}$ decibel (dB), so that a gain of 20 dB corresponds to
a ratio of 10 in amplitude.
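
The two conversions used in this paragraph, from sampling interval to maximum
representable frequency and from pressure amplitude to decibels, can be sketched as
follows. The function names are illustrative and not taken from the simulation code.
The reference pressure is read as $2 \times 10^{-4}$ g/(cm s$^2$), the standard
reference level for sound in air.

```python
import math

P_REF = 2.0e-4   # g/(cm s^2), reference pressure (assumed reading of the text)

def nyquist_frequency(sampling_interval_s):
    """Highest frequency representable at the given sampling interval."""
    return 1.0 / (2.0 * sampling_interval_s)

def pressure_level_db(pressure, p_ref=P_REF):
    """Sound pressure level in decibels: 20 log10(p / p_ref)."""
    return 20.0 * math.log10(pressure / p_ref)

f_max = nyquist_frequency(3.09e-5)
print(f"maximum frequency = {f_max / 1e3:.1f} kHz")

# A tenfold increase in pressure amplitude raises the level by 20 dB.
gain = pressure_level_db(10.0 * P_REF) - pressure_level_db(P_REF)
print(f"gain for a factor of 10 = {gain:.1f} dB")
```

With this reference, a level of 100 dB corresponds to a pressure amplitude of
20 g/(cm s$^2$).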

We notice that the computer simulations predict acoustic signals with amplitudes
near 100 dB, which may seem too large for a recorder, but it should be noted that
the simulation is two-dimensional (the sound spreads as $1/r$ in 2D versus $1/r^2$ in 3D),
and the acoustic signal is sampled inside a small outlet cavity very near the labium
(approximately 5 cm above the labium). Thus, acoustic signals with amplitude near
100 dB are not surprising.

Figure 1-9: The flow during the initial blowing of air into the flue pipe. Frames are
0.49 ms apart, from left to right. Iso-vorticity contours are plotted.

We also notice that the acoustic signals predicted by the lattice Boltzmann and
the compressible finite difference methods are similar, but slightly different. Possible
reasons for the differences are the following: The modeling of boundary conditions
is different between lattice Boltzmann and finite differences because the computational
structure of the methods is very different. Also, the lattice Boltzmann method
can model the high-frequency components of acoustic waves more accurately than
the compressible finite difference method. The above differences between the lattice
Boltzmann method and the compressible finite difference method are not well
understood at present. Future work is needed to understand them.

Figure 1-10: Jet oscillations in the flue-labium region. Frames are 0.33 ms apart,
from left to right. Iso-vorticity contours are plotted.

To get an idea of how the jet of air moves inside the flue pipe, figures 1-9 to 1-13
show sequences of pictures of the flue-labium region from simulations using the lattice
Boltzmann method. Similar pictures are obtained using the finite difference method.
Figures 1-9 and 1-10 come from a simulation of a closed-end soprano recorder which is
6.1 cm long and generates a tone of 1000 Hz (the blowing speed is 900 cm/s, and a
complete picture of this recorder is shown in figure 6-1 of chapter 6). Figures 1-11
to 1-13 come from a simulation of a 20 cm closed-end recorder blown at 1104 cm/s.
Figure 1-11 shows vorticity iso-contours, figure 1-12 shows the velocity vector field,
and figure 1-13 shows kinetic energy iso-contours calculated as $V_x^2 + V_y^2$ and clipped
between the values $1 \times 10^6$ and $2 \times 10^6$ (cm/s)$^2$.

Figure 1-9 illustrates the very beginning of blowing air into the recorder, and
figures 1-10 to 1-13 illustrate the oscillations of the jet after startup. Initially, the
jet of air turns outwards, and moves outside of the labium. This is simply because
the pressure is smaller outside the pipe than inside. Subsequently, the jet begins to
buckle, and starts to oscillate up and down. Meanwhile, the acoustic waves inside
the pipe travel back and forth and build strong acoustic energy inside the pipe. The
acoustic waves interact with the jet so that the jet oscillates at frequencies near the
resonant frequencies of the pipe. Exactly how this happens is not known (section 7.2),
but simple models have been proposed (Verge94 [57, 56], Hirschberg [26]). It would
be an interesting future project to test these models against the precise data which
can be obtained from the present simulations.

Figure 1-11: Jet oscillations of the 20 cm closed-end recorder at blowing speed
1104 cm/s. Frames are 0.22 ms apart, from left to right. Iso-vorticity contours
are plotted. 35.6 ms after startup.

Figure 1-12: Jet oscillations of the 20 cm closed-end recorder at blowing speed
1104 cm/s. Frames are 0.22 ms apart, from left to right. The velocity vector field is
plotted at 1 : 4 the actual grid resolution. 35.6 ms after startup.

1.4.3  Physical  measurements 

Comparing the simulations against physical measurements is very important because
the physical measurements provide information on how close to reality the computer
simulations are. Although the numerical accuracy of a numerical method can be
tested on simple flow problems which possess exact solutions (this is done in chapter 4
for the lattice Boltzmann method), the numerical accuracy on simple problems does
not guarantee that the modeling of a physical phenomenon is correct. There are many
other factors that come into play when a real phenomenon is simulated. For instance,
the underlying differential equations which are solved numerically (chapter 2) may
miss some important effect of the physical phenomenon under consideration. Also,
the numerical boundary conditions are often a poor model of the physical boundary
conditions (for example, the practically-infinite outlet region above the recorder must
be approximated with a small outlet region in the simulations). Thus, there is always
some uncertainty about the physical modeling, which makes the comparison between
simulations and physical measurements very important.

Figure 1-13: Jet oscillations of the 20 cm closed-end recorder at blowing speed
1104 cm/s. Frames are 0.22 ms apart, from left to right. Kinetic energy iso-contours
are plotted. 35.6 ms after startup.


Figure 1-14: The setup for physical measurements. Not drawn to scale.

In the physical measurements presented in this section, a mechanical air supply
is used to blow air into the recorder. The air passes through a regulating valve and a
flow-meter before reaching the recorder, as shown in figure 1-14. Thus, the response
of the recorder can be measured for different blowing speeds. The generated acoustic
signal is measured by means of a CT329 microphone, which is placed at a distance of
approximately 100 cm away from the recorder. The analog signal from the microphone
is digitized using a SONY portable computer with an internal A/D converter. Then,
a Fourier transform is performed to calculate the frequency spectrum.
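
The last step of this measurement chain, from a digitized signal to a frequency
spectrum, can be illustrated with a minimal Python sketch. It evaluates a direct
discrete Fourier transform with the standard library, which stands in for whatever
transform routine was actually used. The input is a synthetic two-tone signal, not
measured data, and the function name and parameters are illustrative.

```python
import cmath
import math

def amplitude_spectrum(samples, sampling_interval_s, f_limit_hz):
    """Return (frequency, amplitude) pairs of a direct DFT up to f_limit_hz."""
    n = len(samples)
    df = 1.0 / (n * sampling_interval_s)      # frequency resolution in Hz
    spectrum = []
    k = 1                                     # skip the constant (k = 0) term
    while k * df <= f_limit_hz and k < n // 2:
        coeff = sum(s * cmath.exp(-2j * math.pi * k * m / n)
                    for m, s in enumerate(samples))
        spectrum.append((k * df, 2.0 * abs(coeff) / n))
        k += 1
    return spectrum

# Synthetic test signal (an assumption, not measured data): a 400 Hz tone plus a
# weaker 1200 Hz tone, sampled at an interval of 2.65e-5 s.
DT = 2.65e-5
N = 1024
signal = [math.sin(2 * math.pi * 400.0 * m * DT)
          + 0.3 * math.sin(2 * math.pi * 1200.0 * m * DT) for m in range(N)]

spectrum = amplitude_spectrum(signal, DT, f_limit_hz=5000.0)
peak_frequency, peak_amplitude = max(spectrum, key=lambda pair: pair[1])
print(f"strongest component near {peak_frequency:.0f} Hz")
```

The frequency resolution of such a spectrum is 1/(N DT), so a longer record is
needed to separate closely spaced modes.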

Figures 1-23 to 1-27 show acoustic signals obtained from physical measurements of
the 20 cm closed-end recorder, and table 1.3 summarizes the frequencies. The acoustic
signals are sampled during steady state (a few seconds after the initial blowing of air
into the recorder). The sampling interval is $2.65 \times 10^{-5}$ s, and corresponds to a
maximum frequency of 18.9 kHz. The absolute amplitude of each measurement is
not known because the measuring apparatus is not calibrated. However, the relative
amplitudes can be compared between different measurements because the measuring
apparatus is identical in all cases.

A  comparison  between  figures  1-23  to  1-27  shows  that  the  amplitude  of  the  acoustic 
signal  increases  with  larger  blowing  velocity.  Also,  acoustic  modes  of  higher  frequency 
are  excited  as  the  blowing  speed  increases.  It  should  be  noted  that  a  frequency  of 
1918  Hz  (see  table  1.3)  is  generated  at  the  blowing  speed  of  2420  cm/s  only  when 
the  initial  blowing  of  air  is  abrupt.  By  contrast,  a  smooth  (slow-rise)  initial  blowing 
of  air  makes  the  recorder  generate  the  lower  mode  near  1145  Hz.  Such  behavior  is 
expected  in  flue  pipes  (Verge94  [56]). 

Another  observation  is  that  the  frequencies  generated  by  the  recorder  are  related 
by  ratios  of  integers  such  as  1  :  3  :  5  :  7  :  9  which  are  characteristic  of  an  open- 
closed  pipe.  For  comparison  purposes,  table  1.4  shows  the  ideal  resonant  frequencies 
of  an  open-open  and  an  open-closed  pipe  which  is  20  cm  long.  The  ideal  resonant 
frequencies are based on the simple model of a pipe as a finite-length string with
appropriate  boundary  conditions  at  the  two  ends.  We  can  see  that  the  ideal  resonant 
frequencies  of  an  open-closed  pipe  are  similar  to  the  frequencies  generated  by  the 
flue,  but  there  are  differences.  This  is  because  the  flue  generates  acoustic  oscillations 
according  to  a  complex  nonlinear  feedback  between  the  acoustic  waves  in  the  pipe 
and  the  hydrodynamic  behavior  of  the  jet  of  air. 
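
The ideal resonant frequencies referred to above follow from the standard textbook
formulas for a one-dimensional resonator of length L: odd multiples of c/(4L) for an
open-closed pipe, and integer multiples of c/(2L) for an open-open pipe. The Python
sketch below assumes these are the formulas behind table 1.4, with c = 34400 cm/s.
The function names are illustrative.

```python
SOUND_SPEED = 34400.0   # cm/s

def open_closed_modes(length_cm, count, c=SOUND_SPEED):
    """Odd multiples of c / (4 L): one end open, one end closed."""
    return [(2 * n + 1) * c / (4.0 * length_cm) for n in range(count)]

def open_open_modes(length_cm, count, c=SOUND_SPEED):
    """Integer multiples of c / (2 L): both ends open."""
    return [(n + 1) * c / (2.0 * length_cm) for n in range(count)]

for name, modes in (("open-closed", open_closed_modes(20.0, 5)),
                    ("open-open", open_open_modes(20.0, 5))):
    row = ", ".join(f"{f:.0f} Hz ({SOUND_SPEED / f:.1f} cm)" for f in modes)
    print(f"{name}: {row}")
```

For a 20 cm pipe the open-closed series starts at 430 Hz and has the ratios
1 : 3 : 5 : 7 : 9, which is the pattern seen in the measured frequencies.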

Finally, it must be noted that the blowing velocities of 1140 cm/s and 1558 cm/s
produce a sound which includes a weak low-frequency beat (perhaps 10-20 Hz).
This beat is not visible in the frequency spectra shown in figures 1-24 and 1-25, but
it can be clearly heard by the human ear. The low-frequency beat is an interesting
issue to investigate in the future, but is not critical for an approximate comparison
between the simulations and the physical measurements.

1.4.4 Comparison between simulation and measurements

Overall,  the  simulations  are  in  reasonable  agreement  with  the  physical  measurements. 
For  instance,  the  lowest  mode  of  400  Hz,  as  well  as  the  higher  modes  near  1200  Hz 
and  2000  Hz  are  predicted  by  the  simulations.  The  qualitative  behavior  of  jumping 
to  higher  modes  with  higher  blowing  speeds  occurs  both  in  the  simulations  and  in 
the  physical  world.  On  the  other  hand,  there  are  differences  also. 

The  major  difference  (or  cause  of  differences)  between  the  simulations  and  the 
physical measurements is that the simulations correspond to the first 30-40 ms after
startup, and the measurements correspond to the steady state a few seconds after
startup (see figure 7-16 of section 7.5 for physical measurements of a startup
transient). In this regard, only a rough comparison is possible between the simulations
and  the  physical  measurements.  A  rough  comparison  is  possible  because  periodic 
oscillations  become  distinct  20  ms  after  startup,  and  the  frequencies  of  the  generated 
sound  can  be  clearly  observed. 

It  must  be  noted  that  computer  simulations  of  the  steady  state  (for  example,  one 
second after startup) would take a lot longer than the present simulations. Furthermore,
a regular flow pattern exiting the outlet region would have to be established.
To perform such simulations, improved boundary conditions are needed for the outlet
region, as well as more compute-power, and perhaps a non-uniform grid to save on
computational effort. Also, it should be noted that the startup transient is very
sensitive to the details of the experimental apparatus. Thus, for the sake of simplicity,
physical  measurements  of  the  steady  state  are  considered  here. 

Leaving aside the issue of steady state versus initial response, it is worth noting
that the acoustic signal is much cleaner (pure tones) in the physical measurements
than in the simulations.^1 Also, the simulated recorder does not sing well at blowing
speed 818 cm/s, and the acoustic signal appears to die 20-30 ms after startup (this is
discussed further in section 7.4). Specific modeling issues which may account for the
above and other differences between the simulations and the physical measurements
are as follows:

• The physical measurements sample the acoustic signal at 100 cm away from
the recorder, while the simulations measure the acoustic signal 5 cm above the
recorder.

• Three-dimensional effects are neglected in the simulations. It is possible that
a 3D jet of air behaves slightly differently from a 2D jet. Also, a 3D resonant
pipe can store more acoustic energy than a 2D resonant pipe. Thus, an exact
correspondence between 2D and 3D at each blowing speed may not be possible.

• Higher spatial resolution than the one employed here ($\Delta x = 0.01$ cm) may be
needed in the flue-labium region to follow the up/down motion of the jet, but
perhaps not. A related issue is that the surface of the labium is rough at very
small length scales $\Delta x = 0.01$ cm (see figures 1-8 and 1-7). The roughness of
the labium may affect the shedding of vortices. However, it is probably a minor
issue at the length scale of $\Delta x = 0.01$ cm, and it diminishes with smaller $\Delta x$.

• The walls of the outlet region near and above the labium reflect acoustic waves.
Such walls are not present in the physical experiments. It is possible that the
reflections from the walls influence the operation of the flue. However, I expect
that the effect is small because the very-top boundary of the outlet region does
not reflect acoustic waves (where the flow exits from the simulation).

• The walls of the outlet region may affect the buildup of hydrodynamic pressure
gradients above the flue. The operation of the flue is very sensitive to the
surrounding pressure gradients.

• The limited size and the two-dimensional form of the outlet region encourage the
accumulation of vortices right above the labium. The vortices introduce
hydrodynamic pressure gradients, and may interfere with the oscillations of the jet.
By contrast, in the physical world (practically infinite and three-dimensional)
the generated vorticity is quickly carried away from the sensitive region of the
flue and labium. In the simulations, the vorticity cannot move away so easily.

^1 The "dip" of the density signal in figure 1-17 at time 150 × 0.206 ms is caused by a very small
vortex that reaches the sampling location, and subsequently moves away. Such a dip is expected
because the density inside a vortex is much smaller than outside (tornado effect). Larger vortices
have a much more pronounced effect than the one shown here. To avoid such effects, the acoustic
signal should not be sampled very near and above the labium where vorticity is shed.

Any one of the above issues, or a combination of them, may be responsible for the
differences  between  the  simulations  and  the  physical  measurements.  However,  the 
most  important  issue  seems  to  be  the  modeling  of  the  outlet  region.  Future  work 
should  be  done  along  the  following  directions: 

•  Improve  the  boundary  conditions  at  the  outlet. 

•  Devise  suitable  means  of  clearing  the  outlet  region  from  accumulated  vorticity. 

• Employ a non-uniform grid to enlarge the outlet region without incurring a large
computational  cost. 

Despite the differences between the simulations and the physical measurements, the
results are very good as a first step.

Figure 1-15: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity 818 cm/s.

Figure 1-16: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity 1104 cm/s.

Figure 1-17: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity 1535 cm/s.

Figure 1-18: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity 1995 cm/s.

Figure 1-19: Compressible finite difference method, 20 cm closed-end soprano
recorder, blowing velocity 838 cm/s.

Figure 1-20: Compressible finite difference method, 20 cm closed-end soprano
recorder, blowing velocity 1113 cm/s.

Figure 1-21: Compressible finite difference method, 20 cm closed-end soprano
recorder, blowing velocity 1634 cm/s.

Figure 1-22: Compressible finite difference method, 20 cm closed-end soprano
recorder, blowing velocity 2082 cm/s.

Figure 1-23: Physical measurements, steady state, 20 cm closed-end soprano recorder,
blowing velocity 734 cm/s. Arbitrary units of amplitude.

Figure 1-24: Physical measurements, steady state, 20 cm closed-end soprano recorder,
blowing velocity 1140 cm/s. Arbitrary units of amplitude.

Figure 1-25: Physical measurements, steady state, 20 cm closed-end soprano recorder,
blowing velocity 1558 cm/s. Arbitrary units of amplitude.

Figure 1-26: Physical measurements, steady state, 20 cm closed-end soprano recorder,
blowing velocity 1985 cm/s. Arbitrary units of amplitude.

Figure 1-27: Physical measurements, steady state, 20 cm closed-end soprano recorder,
blowing velocity 2420 cm/s, and abrupt blow of air at startup. Arbitrary units of
amplitude.


Chapter  2 

The  motion  of  fluids 


In this chapter, the partial differential equations of fluid flow, known as the Navier
Stokes equations, are derived in the context of phenomena such as the flow of air
at room temperature and atmospheric pressure. In addition, an introduction to
hydrodynamics and acoustics is presented as useful background material. Most
of the results of this chapter are not new, as one can infer from the references
to previous work. However, the results are re-derived here and presented in a novel
way with extra care to be correct and relevant to physical reality. In addition, some
discussions, such as the paradox of incompressibility in section 2.4.3 and the
justification of omitting the bulk viscosity in subsonic flow, cannot be found easily in the
literature as far as I know.


2.1  The  scale  of  macroscopic  flow 

A fluid can be modeled either at the microscopic level or at the macroscopic level.
Here, the flow of a fluid is modeled at the macroscopic level where "macroscopic"
means that the fluid is viewed as a continuum and that the underlying molecular
motion is not considered directly. In particular, it is assumed that an infinitesimal
volume of fluid can be defined which is very large compared to the microscopic scales of
molecular motion, and simultaneously very small compared to the macroscopic scales
of fluid flow (Batchelor [3, p.4] and Tritton [54, p.48]). Thus, microscopic statistical
fluctuations are ignored, and the state of the fluid is defined as a continuous function
of space and time.

The above discussion can be made more precise by considering some numbers.
The diameter of an air molecule (modeled as a hard core sphere or billiard ball)
is of the order $3 \times 10^{-8}$ cm (Batchelor [3, p.3], Skordos&Zurek [49, p.878]). The
mean free path (average distance traveled by a molecule between collisions) is of the
order $10^{-5}$ cm at room temperature and atmospheric pressure. The smallest length
scale where the macroscopic fluid dynamics can be safely employed is about $10^{-3}$ cm,
namely, 100 times the mean free path. Occasionally, macroscopic fluid dynamics (the
Navier Stokes equations) are employed at length scales as small as the mean free
path, for example, in ultrasonic acoustics (Morse&Ingard [33]). However, there is no
reason to consider such small length scales here, and $10^{-3}$ cm will be assumed to be
the smallest length scale of interest. It should be noted that an acoustic wavelength
of $10^{-3}$ cm corresponds to an acoustic frequency of 34 MHz.
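
The numbers in this paragraph are tied together by simple arithmetic, which the
following Python sketch spells out. The mean free path is taken as $10^{-5}$ cm and
the speed of sound as 34400 cm/s (the value used in chapter 1). The comparison with
the grid spacing of chapter 1 is added here only for orientation.

```python
SOUND_SPEED = 34400.0        # cm/s, value used in chapter 1
MEAN_FREE_PATH = 1.0e-5      # cm, order of magnitude at room conditions
GRID_SPACING = 0.01          # cm, spatial resolution of the chapter 1 simulations

smallest_scale = 100.0 * MEAN_FREE_PATH          # cm, smallest continuum scale
frequency_hz = SOUND_SPEED / smallest_scale      # f = c / wavelength

print(f"smallest continuum scale = {smallest_scale:.0e} cm")
print(f"acoustic frequency at that wavelength = {frequency_hz / 1e6:.1f} MHz")
print(f"grid spacing / smallest scale = {GRID_SPACING / smallest_scale:.0f}")
```

The grid spacing used in the simulations is therefore about ten times larger than
the smallest length scale at which the continuum description is considered safe.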

2.2  The  conservation  laws 

The three most important properties of fluid flow are the conservation of mass,
momentum, and energy. These conservation properties arise from the underlying
molecular dynamics of fluids, and they are inherited by the macroscopic dynamics. The
conservation properties are so powerful that one can derive the Navier Stokes
equations by imposing conservation at the microscopic level, and by performing
macroscopic averaging of the microscopic dynamics (Huang [27]). Such a derivation is called
the kinetic theory approach. A simplified version of kinetic theory can be found in
section 4.1.2, where it is shown that the lattice Boltzmann method approximates the
Navier Stokes equations through a kinetic theory expansion known as the
Chapman-Enskog expansion.

Besides  the  kinetic  theory  approach,  another  way  of  deriving  the  Navier  Stokes 
equations  is  to  assume  that  the  conservation  of  mass,  momentum,  and  energy  apply 
directly at the macroscopic level. Specifically, an infinitesimal but macroscopic volume
of  fluid  (called  a  fluid  element)  is  considered,  and  its  evolution  in  time  is  examined. 
The  mass  of  the  fluid  element  must  remain  constant  as  the  fluid  element  moves  with 
the  flow.  The  momentum  and  energy  may  change  as  a  result  of  interactions  with  the 
surrounding  fluid  elements,  but  the  interactions  must  conserve  the  total  momentum 
and  energy.  By  considering  small  changes  during  a  sufficiently  small  interval  of  time, 
a  set  of  partial  differential  equations  can  be  derived  which  describe  the  evolution  of 
mass,  momentum,  and  energy  of  individual  fluid  elements. 

An important simplification in deriving the macroscopic equations of fluid flow is
to introduce flow variables (density, velocity, and temperature) which are functions
of space and time. The flow variables are an alternative way of describing the flow
as opposed to the mass, momentum, and energy of individual fluid elements. The
two  approaches  are  equivalent.  For  instance,  by  integrating  the  values  of  the  flow 
density  and  velocity  inside  a  given  volume  of  space  at  a  particular  point  in  time,  we 
can  obtain  the  mass  and  the  momentum  of  a  fluid  element  that  corresponds  to  the 
volume  of  space  under  consideration  at  that  particular  time. 

The  flow  variables  are  simpler  to  use  than  the  mass,  momentum,  and  energy  of 
individual fluid elements because the flow variables are defined on a fixed coordinate
system, and do not move with the flow as the fluid elements do (Morse&Ingard [33,
p.235], Batchelor [3, p.71], Lamb [31, p.12]). When the description of a flow is based
on  the  flow  variables  only,  it  is  called  Eulerian.  Alternatively,  when  the  description 
of  a  flow  refers  to  the  properties  of  individual  fluid  elements,  it  is  called  Lagrangian. 
Most  texts  in  fluid  mechanics  follow  the  Eulerian  description,  and  this  will  be  done 
here  also. 

Below, the Navier Stokes equations are derived using the ideas outlined above. For
this purpose, the fluid density $\rho(x, y, z, t)$ and the fluid velocity $V_j(x, y, z, t)$ are
introduced as continuous functions of space and time, where the components $V_j$ of the
fluid velocity correspond to the Cartesian directions $x, y, z$ for $j = 1, 2, 3$ respectively.
Also, the advective derivative $D/Dt$ is introduced as follows,

$$\frac{D}{Dt} = \frac{\partial}{\partial t} + V_j \frac{\partial}{\partial x_j} \qquad (2.1)$$

where the Einstein summation convention is used: when an index appears twice in
the same term, a summation is automatically implied. The notation $x_j$ stands for
$x, y, z$ when $j = 1, 2, 3$. The advective derivative is a special case of the total derivative
of a variable which is a function of $x, y, z, t$ under the following assumption,

$$\frac{dx_j}{dt} = V_j \qquad (2.2)$$

The above assumption is true in the case of a fluid element which moves with the local
velocity $V_j$ of the flow. It turns out that the advective derivative is omnipresent in
fluid mechanics, and it is worth reserving the symbol $D/Dt$ to refer to the advective
derivative (Batchelor [3, p.73]).
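
The meaning of the advective derivative can be checked numerically. The Python sketch
below compares the two descriptions in one space dimension: the Eulerian expression
(time derivative plus velocity times space derivative, evaluated at a fixed point)
and the rate of change seen by a fluid element that moves with the local velocity.
The velocity field V(x) = 0.5 x and the scalar field q(x, t) = x^2 t are arbitrary
choices made for this illustration, and all function names are hypothetical.

```python
import math

def velocity(x):
    """Assumed 1D velocity field V(x) = 0.5 x (illustrative only)."""
    return 0.5 * x

def field(x, t):
    """Assumed scalar field q(x, t) = x^2 t (illustrative only)."""
    return x * x * t

def particle_position(x0, t0, t):
    """Exact path of a fluid element with dx/dt = V(x), located at x0 at time t0."""
    return x0 * math.exp(0.5 * (t - t0))

def advective_derivative(x, t, h=1e-5):
    """dq/dt + V dq/dx at a fixed point, using central differences."""
    dq_dt = (field(x, t + h) - field(x, t - h)) / (2 * h)
    dq_dx = (field(x + h, t) - field(x - h, t)) / (2 * h)
    return dq_dt + velocity(x) * dq_dx

def rate_following_element(x, t, h=1e-5):
    """Rate of change of q seen by the fluid element passing through x at time t."""
    ahead = field(particle_position(x, t, t + h), t + h)
    behind = field(particle_position(x, t, t - h), t - h)
    return (ahead - behind) / (2 * h)

x, t = 1.5, 2.0
eulerian = advective_derivative(x, t)
lagrangian = rate_following_element(x, t)
print(eulerian, lagrangian)
```

For these fields the exact value is x^2 (1 + t), and both estimates agree with it to
within the accuracy of the finite differences.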


2.2.1  Mass  conservation 


First,  the  mass  conservation  equation  is  derived,  which  is  also  known  as  the  mass 
continuity equation. We consider a fluid element which is positioned at $x, y, z$ at time
$t$, and has volume $\Delta(x, y, z, t)$. The mass of the fluid element is conserved and is
equal to $\rho\Delta$. Therefore, the total derivative of the mass must be zero, or more
precisely the advective derivative $D/Dt$ must be zero because the fluid element moves
with the local velocity $V_j$ of the flow. Thus,

$$\frac{D(\rho\Delta)}{Dt} = 0 \tag{2.3}$$

which gives

$$\Delta \frac{D\rho}{Dt} + \rho \frac{D\Delta}{Dt} = 0 \tag{2.4}$$




and

$$\frac{D\rho}{Dt} + \rho \left( \frac{1}{\Delta} \frac{D\Delta}{Dt} \right) = 0 \tag{2.5}$$


To proceed further, we need to express the relative change of the volume of the fluid
element, $(1/\Delta)\, D\Delta/Dt$, in terms of the flow variables. As we will see below, the
relative change of the volume of the fluid element (also known as dilatation) is equal to
the divergence of the fluid velocity.


$$\frac{1}{\Delta} \frac{D\Delta}{Dt} = \frac{\partial V_j}{\partial x_j} \tag{2.6}$$


To prove equation 2.6, we examine how the geometry of the fluid element distorts as
the fluid element moves with the flow. Following Lamb [31, p.5], we consider a cubic
fluid volume such as the one shown in figure 2-1. We assume that the six faces of the
cubic volume are initially aligned with the axes of the coordinate system. The center
of the volume is located at some point $(x_1, x_2, x_3)$, and the volume has dimensions
$(\Delta x_1, \Delta x_2, \Delta x_3)$. The two faces of the cube that are opposite each other
along the $x_1$ direction are referred to as the $x_1$-faces of the cube, and they are located at


$$\left( x_1 \pm \frac{\Delta x_1}{2},\; x_2,\; x_3 \right) \tag{2.7}$$


If the fluid velocity is equal to $(V_1, V_2, V_3)$ at the center point $(x_1, x_2, x_3)$ of
the cube, then the $x_1$-faces are moving outwards (expanding) with the following velocities
along the $x_1$ direction.


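$$V_1 + \frac{\partial V_1}{\partial x_1} \frac{\Delta x_1}{2} \tag{2.8}$$

$$-\left( V_1 - \frac{\partial V_1}{\partial x_1} \frac{\Delta x_1}{2} \right) \tag{2.9}$$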

The above quantities express the change of volume along the $x_1$ direction. The
motion of the $x_1$-faces along the $x_2$ and $x_3$ directions produces shearing of the volume
only, and does not change the volume to first order in the differential quantities
$\Delta x_1, \Delta x_2, \Delta x_3$. Thus, we can ignore the shearing motion here. After an
infinitesimal interval of time $\Delta t$ has elapsed, the change of volume due to expansion
along the $x_1$ direction is equal to

$$\frac{\partial V_1}{\partial x_1}\, \Delta x_1\, \Delta x_2\, \Delta x_3\, \Delta t \tag{2.10}$$

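Similar relations can be obtained for the expansion along the $x_2, x_3$ directions using
the other faces of the cubic volume. The total rate of change of volume per unit of
time (expanding volume) is given by the sum of the above terms,

$$\frac{D\Delta}{Dt} = \left( \frac{\partial V_1}{\partial x_1} + \frac{\partial V_2}{\partial x_2} + \frac{\partial V_3}{\partial x_3} \right) \Delta x_1\, \Delta x_2\, \Delta x_3 \tag{2.11}$$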

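Combining equation 2.11 with equation 2.5 and the fact that
$\Delta = \Delta x_1 \Delta x_2 \Delta x_3$ we obtain

$$\frac{D\rho}{Dt} + \rho \frac{\partial V_j}{\partial x_j} = 0 \tag{2.12}$$

where the summation convention is used. We also use the notation,

$$\frac{D\rho}{Dt} + \rho\, (\nabla \cdot V) = 0 \tag{2.13}$$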


The above is the mass continuity equation. We have derived it by considering the
conservation of mass of a moving fluid element during an infinitesimal interval of time,
and by relating the mass of the fluid element to the Eulerian density and velocity of
the flow.
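The key step of the derivation, namely that the dilatation of a small box equals the divergence of the velocity (equation 2.6), can be illustrated with a short numerical experiment. This is a minimal sketch under assumptions: the linear velocity field and the function names are hypothetical choices made only for this check. The faces of a small box are moved with the normal component of the local velocity, as in equations 2.8 and 2.9, and the relative rate of change of the volume is compared with the divergence.

```python
def velocity(x, y, z):
    """Hypothetical linear velocity field with divergence 0.3 - 0.1 + 0.05 = 0.25.
    The 0.7*y term is a pure shear and should not change the volume."""
    return (0.3 * x + 0.7 * y, -0.1 * y, 0.05 * z)

def relative_volume_rate(center, size, dt=1e-6):
    """Estimate (1/Delta) D(Delta)/Dt for a small box by moving each pair of
    opposite faces with the normal component of the local velocity for time dt."""
    new_lengths = []
    for axis in range(3):
        plus = list(center)
        minus = list(center)
        plus[axis] += size[axis] / 2
        minus[axis] -= size[axis] / 2
        # Outward speeds of the two opposite faces (equations 2.8 and 2.9)
        out_plus = velocity(*plus)[axis]
        out_minus = -velocity(*minus)[axis]
        new_lengths.append(size[axis] + (out_plus + out_minus) * dt)
    old_volume = size[0] * size[1] * size[2]
    new_volume = new_lengths[0] * new_lengths[1] * new_lengths[2]
    return (new_volume - old_volume) / (old_volume * dt)

rate = relative_volume_rate(center=(1.0, 2.0, 3.0), size=(1e-3, 1e-3, 1e-3))
print(rate)  # approximately equal to the divergence of the field, 0.25
```

The shear term contributes nothing to the result, which is the first-order statement made in the text about shearing motion.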

An alternative way of deriving the mass continuity equation is to consider a
fixed-in-space volume of fluid, and to balance the mass which flows through the boundaries
of the volume with the change of density inside the volume. This alternative approach
is found in Landau&Lifshitz [32, p.1] and Batchelor [3, p.74], and it produces the
following equation,

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho V) = 0 \tag{2.14}$$

which is equivalent to equation 2.13. In my opinion, the approach of the moving
fluid volume is somewhat more intuitive than the fixed-in-space volume because it is
easier to visualize what happens when the fluid volume moves and distorts with the
flow. On the other hand, the use of both approaches leads to a better understanding
than either one by itself. It should be noted that the fixed-in-space volume is a purely
Eulerian approach, while the moving fluid volume, as described above, is a Lagrangian
idea expressed in Eulerian flow variables.

Figure 2-1: A fluid element whose shape is a cube at time zero, and its six faces are
normal to the Cartesian axes.
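The fixed-in-space form of the continuity equation (equation 2.14) lends itself to a simple discrete illustration: if the mass that leaves one cell through a face is exactly the mass that enters its neighbor, the total mass cannot change. The sketch below assumes a one-dimensional periodic grid, a prescribed positive velocity, and a first-order upwind flux. These are illustrative choices, not the numerical method used in this work.

```python
import math

def step_density(rho, v, dx, dt):
    """One explicit update of d(rho)/dt + d(rho*v)/dx = 0 (equation 2.14 in 1D)
    on a periodic grid. v[i] is the (assumed positive) velocity at the right
    face of cell i, and the flux through that face uses the upwind density."""
    n = len(rho)
    flux = [rho[i] * v[i] for i in range(n)]  # mass per unit time leaving cell i
    # flux[i - 1] with i = 0 wraps around to the last cell (periodic boundary)
    return [rho[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]

n, dx, dt = 50, 0.02, 0.002
rho = [1.0 + 0.1 * math.sin(2 * math.pi * i / n) for i in range(n)]
v = [1.0 + 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

mass_before = sum(rho) * dx
for _ in range(200):
    rho = step_density(rho, v, dx, dt)
mass_after = sum(rho) * dx
print(mass_before, mass_after)
```

The density profile changes considerably during the run because the velocity is not uniform, but the total mass stays the same up to rounding error.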


2.2.2  Momentum  conservation 


The momentum Navier Stokes equation can be derived in a similar way to the mass
conservation equation by considering the changes of momentum of a fluid element
during an infinitesimal interval of time. If we consider the forces acting on the six
faces of a cubic volume, we can write an equation for the conservation of momentum
along the $x_j$ direction, as follows.

$$\frac{D(\rho V_j)}{Dt} = \frac{\partial (\sigma_{jk})}{\partial x_k} \tag{2.15}$$


where $\sigma_{jk}$ is called the pressure tensor, and it models the forces that arise from
pressure and from viscosity (internal friction of the fluid medium). The derivation
of the pressure tensor is somewhat long and is omitted here. The details can be
found in standard textbooks such as Landau&Lifshitz [32, p.45], Batchelor [3, p.147],
and Newman [34, pp. 50-63]. These references show that the pressure tensor can be written
as follows,

$$\sigma_{jk} = -P \delta_{jk} + \eta \left( \frac{\partial V_j}{\partial x_k} + \frac{\partial V_k}{\partial x_j} \right) + \left( -\frac{2}{3}\eta + \zeta \right) (\nabla \cdot V)\, \delta_{jk} \tag{2.16}$$

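where $P$ is the scalar pressure, $\eta$ is the first coefficient of viscosity (corresponding to
friction from shearing motion), and $\zeta$ is the second coefficient of viscosity
(corresponding to friction from bulk-expanding motion). The above tensors can be represented
in a Cartesian coordinate system in terms of $3 \times 3$ matrices as follows,

$$\delta_{jk} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{2.17}$$

$$\left( \frac{\partial V_j}{\partial x_k} + \frac{\partial V_k}{\partial x_j} \right) =
\begin{pmatrix}
2\dfrac{\partial V_1}{\partial x} & \dfrac{\partial V_1}{\partial y} + \dfrac{\partial V_2}{\partial x} & \dfrac{\partial V_1}{\partial z} + \dfrac{\partial V_3}{\partial x} \\[2ex]
\dfrac{\partial V_2}{\partial x} + \dfrac{\partial V_1}{\partial y} & 2\dfrac{\partial V_2}{\partial y} & \dfrac{\partial V_2}{\partial z} + \dfrac{\partial V_3}{\partial y} \\[2ex]
\dfrac{\partial V_3}{\partial x} + \dfrac{\partial V_1}{\partial z} & \dfrac{\partial V_3}{\partial y} + \dfrac{\partial V_2}{\partial z} & 2\dfrac{\partial V_3}{\partial z}
\end{pmatrix} \tag{2.18}$$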


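Further, the following identities are useful,

$$\frac{\partial (P \delta_{jk})}{\partial x_k} = \frac{\partial P}{\partial x_j} \tag{2.19}$$

$$\frac{\partial}{\partial x_k} \left( \frac{\partial V_j}{\partial x_k} + \frac{\partial V_k}{\partial x_j} \right) = \left( \frac{\partial^2 V_j}{\partial x_k \partial x_k} \right) + \frac{\partial}{\partial x_j} \left( \frac{\partial V_k}{\partial x_k} \right) \tag{2.20}$$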

The above identities can be used to write the momentum Navier Stokes equations in
the following form.

$$\frac{D(\rho V_j)}{Dt} = -\frac{\partial P}{\partial x_j} + \eta\, \nabla^2 V_j + \left( \frac{\eta}{3} + \zeta \right) \frac{\partial (\nabla \cdot V)}{\partial x_j} \tag{2.21}$$

where $j = 1, 2, 3$. The physical interpretation of the various terms of the above
momentum equation will be discussed in section 2.4. Next, the conservation of energy
is examined.


2.3  Adiabatic  variations  of  temperature 

In  general,  the  simulation  of  viscous  flow  with  acoustic  waves  requires  the  complete 
Navier  Stokes  equations:  the  mass  continuity  equation,  three  equations  for  momen¬ 
tum  conservation,  an  equation  for  energy  conservation,  and  an  equation  of  state  which 
relates  the  three  thermodynamic  variables  (temperature,  pressure,  and  density).  The 
temperature  represents  the  internal  energy  of  the  fluid,  and  arises  from  the  internal 
degrees  of  freedom  such  as  the  vibrations  of  the  fluid  molecules.  The  energy  equa¬ 
tion  couples  together  the  temperature  variations  with  the  density  and  momentum 
variations  of  the  flow. 

In  special  cases,  such  as  the  flow  of  air  at  room  temperature  and  atmospheric 
pressure,  the  coupling  between  the  temperature  and  the  momentum  of  the  flow  is 
very  small  and  can  be  neglected.  In  particular,  the  partial  differential  equation  for 
energy  conservation  can  be  replaced  with  an  exact  relation  between  the  temperature, 
density,  and  pressure.  This  relation  is  called  the  adiabatic  approximation,  and  it 
is  employed  in  the  simulations  presented  here,  in  order  to  avoid  solving  a  partial 
differential  equation  corresponding  to  the  conservation  of  energy. 

In  the  adiabatic  approximation,  it  is  assumed  that  there  is  no  conduction  of  heat 
between  different  parts  of  the  flow.  In  addition,  it  is  assumed  that  there  are  local 
heat  reservoirs  at  each  point  in  space  which  allow  local  temperature  oscillations,  but 
without  any  conduction  of  heat.  The  local  heat  reservoirs  are  necessary  because  the 
density  fluctuations  of  acoustic  waves  are  accompanied  by  small,  but  non-negligible 
temperature  fluctuations.  Namely,  when  the  air  suddenly  compresses,  its  temperature 
rises; when the air expands, its temperature drops.

The justification for the adiabatic approximation is easy to understand in the case
of acoustic oscillations: the acoustic oscillations happen very fast, so it makes sense to
assume that there is no conduction of heat. However, the adiabatic approximation
applies more generally as we shall see afterwards. First, let us derive an exact relation
between the temperature, density, and pressure by considering the temperature
fluctuations of acoustic waves. This idea is due to Laplace, and is explained very nicely
in Rayleigh’s book [42, p.20]. Mathematically, we define $P_0, \rho_0, \theta_0$ as the initial
values of pressure, density, and temperature, and $P, \rho, \theta$ as the new values after an
adiabatic change. Then, the following relation applies which is known as the adiabatic law
(see section 2.3.1 for a derivation).


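$$\frac{P}{P_0} = \left( \frac{\rho}{\rho_0} \right)^{\gamma} = \left( \frac{\theta}{\theta_0} \right)^{\gamma/(\gamma - 1)} \tag{2.22}$$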


where $\gamma$ is the ratio of the specific heats of the gas, and it is equal to 1.4 in the case
of air. We also define the small variations $P', \rho', \theta'$ around the constant mean values
$P_0, \rho_0, \theta_0$ as follows,

$$\begin{aligned}
P &= P_0 + P' \\
\rho &= \rho_0 + \rho' \\
\theta &= \theta_0 + \theta'
\end{aligned} \tag{2.23}$$

We can obtain a relation between the variations $P'$ and $\rho'$ by expanding the following
expression to first order in small quantities.

$$\frac{P}{P_0} = 1 + \frac{P'}{P_0} = \left( 1 + \frac{\rho'}{\rho_0} \right)^{\gamma} \approx 1 + \gamma \frac{\rho'}{\rho_0} \tag{2.24}$$


Therefore,

$$P' = \gamma \frac{P_0}{\rho_0}\, \rho' \tag{2.25}$$

To  proceed  further,  we  use  the  equation  of  state  for  gases,  which  is  a  relation  between 
the  mean  values  of  the  thermodynamic  variables. 


$$P_0 = R\, \rho_0\, \theta_0 \tag{2.26}$$

where $\theta_0$ is expressed in absolute degrees Kelvin, and $R$ is a gas constant which is
equal to $2.870 \times 10^6$ cm$^2$/s$^2$ per degree Kelvin in the case of air (Batchelor [3, p.43]
and Lamb [31, p.478]). Equations 2.25 and 2.26 give

$$P' = (\gamma R \theta_0)\, \rho' \tag{2.27}$$

which can be written as

$$P' = c_s^2\, \rho' \tag{2.28}$$

with the definition

$$c_s = \sqrt{\gamma R \theta_0} \tag{2.29}$$

The constant $c_s$ is the speed of the propagation of acoustic waves as we will see in
section 2.5. The precise relation between pressure and density is as follows,

$$P = c_s^2 \rho + (P_0 - c_s^2 \rho_0) \tag{2.30}$$

For the sake of simplicity, the following formula is commonly used (throughout this
work and elsewhere),

$$P = c_s^2 \rho \tag{2.31}$$

with the understanding that an arbitrary offset may be subtracted from the pressure
because only the gradients of the pressure influence the flow.
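To get a feeling for how good the first-order expansion of equation 2.24 is, one can compare it with the full adiabatic law of equation 2.22 for density variations of different sizes. The following sketch is only illustrative; the function names are hypothetical, and the only physical input is the value of 1.4 for the ratio of specific heats quoted in the text.

```python
GAMMA = 1.4  # ratio of specific heats for air, as stated in the text

def exact_pressure_ratio(density_ratio):
    """Adiabatic law, equation 2.22: P/P0 = (rho/rho0)**gamma."""
    return density_ratio ** GAMMA

def linear_pressure_ratio(density_ratio):
    """First-order expansion, equation 2.24: P/P0 ~ 1 + gamma * rho'/rho0."""
    return 1.0 + GAMMA * (density_ratio - 1.0)

def relative_error(density_ratio):
    """Relative difference between the linearized and the exact pressure."""
    exact = exact_pressure_ratio(density_ratio)
    return abs(linear_pressure_ratio(density_ratio) - exact) / exact

for eps in (1e-4, 1e-3, 1e-2, 1e-1):
    # eps plays the role of rho'/rho0
    print(eps, relative_error(1.0 + eps))
```

The error shrinks roughly with the square of the density variation, which is why the linear relation between pressure and density is adequate when the variations are small.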

Above, an exact relation between the density and the pressure has been derived
by examining the adiabatic changes of pressure, density, and temperature of acoustic
waves. It turns out that the adiabatic approximation applies more generally to any
variations of density in subsonic flow as long as the variations are small. The reason
is as follows. Let us consider a steady flow inside a pipe (Hagen-Poiseuille flow,
Landau&Lifshitz [32, p.51]), and let us ask whether the relation between pressure and
density variations $P = c_s^2 \rho$ still applies. The answer is yes, and the adiabatic law
still applies because what is important is how the state of equilibrium is reached.
Any disturbance in the fluid is transmitted by fast acoustic waves, so that the new
state of equilibrium is reached quickly and adiabatically to a good approximation.
Accordingly, the relation between small variations of pressure and density which is
derived above applies in general for any variations of density in subsonic flow.

For historical interest, it should be noted that Laplace proposed the adiabatic law
between pressure and density in order to calculate the speed of sound using
equation 2.29. Before Laplace’s formula, the previous estimate of the speed of sound fell
short of experimental measurements. The previous estimate, attributed to Newton,
assumed Boyle’s law of infinitely slow changes at constant temperature,

$$\frac{P}{P_0} = \frac{\rho}{\rho_0} \tag{2.32}$$

which misses the constant factor $\gamma$ so that the speed of sound comes out short by a
factor $\sqrt{\gamma} = 1.18$.
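The two estimates of the speed of sound can be compared with a few lines of arithmetic. The sketch below uses the values of the ratio of specific heats and the gas constant of air quoted in the text; the temperature of 293.15 K (20 degrees Celsius) is an assumption made only for this example, and the function names are hypothetical.

```python
import math

GAMMA = 1.4      # ratio of specific heats for air (from the text)
R_AIR = 2.870e6  # gas constant for air in cm^2/s^2 per kelvin (from the text)

def speed_of_sound_laplace(theta0):
    """Equation 2.29: c_s = sqrt(gamma * R * theta0), in cm/s."""
    return math.sqrt(GAMMA * R_AIR * theta0)

def speed_of_sound_newton(theta0):
    """Isothermal estimate based on Boyle's law, which omits the factor gamma."""
    return math.sqrt(R_AIR * theta0)

theta0 = 293.15  # assumed room temperature in kelvin, not a value from the text
c_laplace = speed_of_sound_laplace(theta0)
c_newton = speed_of_sound_newton(theta0)
print(c_laplace, c_newton, c_laplace / c_newton)
```

The ratio of the two estimates does not depend on the temperature and is equal to the square root of the ratio of specific heats.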


2.3.1  Derivation  of  the  adiabatic  law 


For  completeness,  a  derivation  of  the  adiabatic  law  (equation  2.22)  is  presented  here, 
which  follows  closely  the  derivation  of  Rayleigh  [42,  p.21].  First,  the  equation  of  state 
for gases is considered which relates the three thermodynamic variables pressure $P$,
density $\rho$, and temperature $\theta$,

$$P = \rho R \theta \tag{2.33}$$


This  can  also  be  written  as. 


$$P \Delta = R' \theta \tag{2.34}$$


where $\Delta$ is the volume under consideration, and is related to the density $\rho$ as follows,

$$\frac{d\Delta}{\Delta} = -\frac{d\rho}{\rho} \tag{2.35}$$


The  new  gas  constant  R'  is  equal  to  the  original  gas  constant  R  times  the  mass  of  the 
volume  under  consideration.  Differentiation  of  equation  2.34  produces  the  differential 
equation  of  state  which  will  be  used  below. 


$$\frac{dP}{P} + \frac{d\Delta}{\Delta} = \frac{d\theta}{\theta} \tag{2.36}$$

First, it is noted that the equation of state constrains the three thermodynamic
variables $P, \Delta, \theta$, so that any two of them can be taken as independent variables,
for example $P$ and $\Delta$. Further, if an additional constraint is introduced, it may be
possible to obtain an exact relation between the thermodynamic variables because only one
variable will then be independent. The additional assumption is that there is no
communication of heat in the medium.

In order to exploit the assumption of no communication of heat, we examine the
amount of heat in the fluid volume as a function of the pressure $P$ and the volume
$\Delta$. If the amount of heat is denoted $Q$, the following total differential expresses the
conduction of heat in terms of changes in pressure and volume.

$$dQ = \left( \frac{\partial Q}{\partial \Delta} \right) d\Delta + \left( \frac{\partial Q}{\partial P} \right) dP \tag{2.37}$$


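The above equation can be simplified by considering changes of heat under constant
pressure, $dP = 0$, and also changes of heat under constant volume, $d\Delta = 0$. In
particular, using the differential equation 2.36, the following relations can be obtained.

$$\kappa_p = \left( \frac{dQ}{d\theta} \right)_{P} = \left( \frac{\partial Q}{\partial \Delta} \right) \left( \frac{\partial \Delta}{\partial \theta} \right) = \left( \frac{\partial Q}{\partial \Delta} \right) \frac{\Delta}{\theta} \tag{2.38}$$

$$\kappa_v = \left( \frac{dQ}{d\theta} \right)_{\Delta} = \left( \frac{\partial Q}{\partial P} \right) \left( \frac{\partial P}{\partial \theta} \right) = \left( \frac{\partial Q}{\partial P} \right) \frac{P}{\theta} \tag{2.39}$$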

The above quantities are the ratios of changes in heat divided by the changes in
temperature under constant pressure and under constant volume. They are called
specific heats, and they are constant within a wide range of temperatures and
pressures (Batchelor [3, p.44]). They are certainly constant for the purpose of modeling
air flow inside flue pipes. Using the above relations together with the assumption
that there is no conduction of heat, $dQ = 0$, equation 2.37 becomes


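$$dQ = \kappa_p\, \theta\, \frac{d\Delta}{\Delta} + \kappa_v\, \theta\, \frac{dP}{P} = 0 \tag{2.40}$$

or

$$-\kappa_p \frac{d\rho}{\rho} + \kappa_v \frac{dP}{P} = 0 \tag{2.41}$$

which gives

$$\frac{dP}{d\rho} = \left( \frac{\kappa_p}{\kappa_v} \right) \frac{P}{\rho} \tag{2.42}$$

or equivalently,

$$\frac{dP}{d\rho} = \gamma \frac{P}{\rho} \tag{2.43}$$

with the appropriate definition of the constant $\gamma$,

$$\gamma = \kappa_p / \kappa_v \tag{2.44}$$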


By performing an integration of equation 2.43 using logarithms, the adiabatic law is
obtained,

$$\frac{P}{P_0} = \left( \frac{\rho}{\rho_0} \right)^{\gamma} \tag{2.45}$$

where $P_0, \rho_0$ are two initial values. This is the adiabatic law of equation 2.22 which
we wanted to prove.
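The integration step can also be checked numerically by integrating the differential form of equation 2.43 directly and comparing the result with the closed form of equation 2.45. The sketch below uses the classical fourth-order Runge-Kutta method and nondimensional reference values; both are illustrative assumptions and are not part of the derivation.

```python
GAMMA = 1.4  # ratio of specific heats for air, as stated in the text

def integrate_adiabat(p0, rho0, rho1, steps=1000):
    """Integrate dP/drho = gamma * P / rho (equation 2.43) from rho0 to rho1
    with the classical fourth-order Runge-Kutta method."""
    def slope(rho, p):
        return GAMMA * p / rho

    h = (rho1 - rho0) / steps
    rho, p = rho0, p0
    for _ in range(steps):
        k1 = slope(rho, p)
        k2 = slope(rho + h / 2, p + h * k1 / 2)
        k3 = slope(rho + h / 2, p + h * k2 / 2)
        k4 = slope(rho + h, p + h * k3)
        p += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        rho += h
    return p

p0, rho0 = 1.0, 1.0  # arbitrary reference state in nondimensional units
numeric = integrate_adiabat(p0, rho0, 2.0)
closed_form = p0 * (2.0 / rho0) ** GAMMA  # equation 2.45
print(numeric, closed_form)
```

The two values agree to many digits, even for a doubling of the density, which is far outside the small variations of acoustic waves.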

We recall that the adiabatic law of equations 2.22 and 2.45 was the starting point
for calculating the relation between small variations of pressure and density
$P' = c_s^2 \rho'$. Here, we can also see that an alternative way of deriving the relation
$P' = c_s^2 \rho'$ would be to assume small variations $P', \rho'$ around an initial point
$P_0, \rho_0$ in equation 2.43 and write

$$\frac{dP}{d\rho} = \gamma \frac{P}{\rho} \approx \gamma \frac{P_0}{\rho_0} \tag{2.46}$$

An integration that involves the small variations $P', \rho'$ gives

$$P' = \left( \gamma \frac{P_0}{\rho_0} \right) \rho' \tag{2.47}$$

This is the same relation between small variations of pressure and density which was
derived previously in equation 2.25.

Armed with an exact relation between the pressure and the density, we can proceed
in the following sections to analyze the physical properties of the Navier Stokes
equations.


2.4 The Navier Stokes equations


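The Navier Stokes equations which I use to model air flow inside flue pipes can be
written compactly as follows,

$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho V_j)}{\partial x_j} = 0 \tag{2.48}$$

$$\frac{D(\rho V_j)}{Dt} + c_s^2 \frac{\partial \rho}{\partial x_j} - \nu \rho\, \nabla^2 V_j - \mu \rho\, \frac{\partial (\nabla \cdot V)}{\partial x_j} = 0 \tag{2.49}$$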

They are partial differential equations which express the conservation of mass and
momentum, and must be solved numerically. In addition, there is an exact
relation between the temperature, the density, and the pressure according to the
adiabatic law. This relation completes the physical model, and replaces a partial
differential equation for energy conservation as explained in the previous section.
The adiabatic relation between pressure and density variations is as follows.

$$P = c_s^2 \rho \tag{2.50}$$


Regarding notation, the index $j$ in the Navier Stokes equations runs over
$j = 1, 2, 3$. The symbol $D/Dt$ is the advective derivative, and $c_s$ is the speed of sound.
The coefficients $\nu$ and $\mu$ are density-normalized viscosity coefficients which are defined
as follows.

$$\nu = \frac{\eta}{\rho}, \qquad \mu = \frac{\eta/3 + \zeta}{\rho} \tag{2.51}$$

where $\eta$ and $\zeta$ are the un-normalized viscosity coefficients defined in section 2.2. The
coefficients $\nu$ and $\mu$ will be used from now on, and they are called kinematic and bulk
viscosity respectively.


The  above  partial  differential  equations  can  be  written  in  expanded  form  as  follows. 


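$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho V_x)}{\partial x} + \frac{\partial (\rho V_y)}{\partial y} + \frac{\partial (\rho V_z)}{\partial z} = 0 \tag{2.52}$$

$$\frac{\partial (\rho V_x)}{\partial t} + \frac{\partial (\rho V_x V_x)}{\partial x} + \frac{\partial (\rho V_x V_y)}{\partial y} + \frac{\partial (\rho V_x V_z)}{\partial z} + \frac{\partial (c_s^2 \rho)}{\partial x} - \nu \rho\, \nabla^2 V_x - \mu \rho\, \frac{\partial (\nabla \cdot V)}{\partial x} = 0 \tag{2.53}$$

$$\frac{\partial (\rho V_y)}{\partial t} + \frac{\partial (\rho V_x V_y)}{\partial x} + \frac{\partial (\rho V_y V_y)}{\partial y} + \frac{\partial (\rho V_y V_z)}{\partial z} + \frac{\partial (c_s^2 \rho)}{\partial y} - \nu \rho\, \nabla^2 V_y - \mu \rho\, \frac{\partial (\nabla \cdot V)}{\partial y} = 0 \tag{2.54}$$

$$\frac{\partial (\rho V_z)}{\partial t} + \frac{\partial (\rho V_x V_z)}{\partial x} + \frac{\partial (\rho V_y V_z)}{\partial y} + \frac{\partial (\rho V_z V_z)}{\partial z} + \frac{\partial (c_s^2 \rho)}{\partial z} - \nu \rho\, \nabla^2 V_z - \mu \rho\, \frac{\partial (\nabla \cdot V)}{\partial z} = 0 \tag{2.55}$$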


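where $\rho, V_x, V_y, V_z$ are the fluid density and the components of the fluid velocity in
the $x, y, z$ directions respectively. The expanded form of the Laplacian operator is
as follows.

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{2.56}$$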


A simplified form of the Navier Stokes equations can be obtained by omitting the
bulk viscosity term $\mu \rho\, \partial(\nabla \cdot V)/\partial x$ because it is very small in the case
of subsonic flow (see section 2.4.2). Also, the continuity equation 2.52, multiplied by the
corresponding velocity component, can be subtracted from each one of the momentum
equations 2.53-2.55, and the equations can be divided by the density $\rho$. The resulting
equations have the following form,

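$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho V_x)}{\partial x} + \frac{\partial (\rho V_y)}{\partial y} + \frac{\partial (\rho V_z)}{\partial z} = 0 \tag{2.57}$$

$$\frac{\partial V_x}{\partial t} + V_x \frac{\partial V_x}{\partial x} + V_y \frac{\partial V_x}{\partial y} + V_z \frac{\partial V_x}{\partial z} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial x} - \nu\, \nabla^2 V_x = 0 \tag{2.58}$$

$$\frac{\partial V_y}{\partial t} + V_x \frac{\partial V_y}{\partial x} + V_y \frac{\partial V_y}{\partial y} + V_z \frac{\partial V_y}{\partial z} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial y} - \nu\, \nabla^2 V_y = 0 \tag{2.59}$$

$$\frac{\partial V_z}{\partial t} + V_x \frac{\partial V_z}{\partial x} + V_y \frac{\partial V_z}{\partial y} + V_z \frac{\partial V_z}{\partial z} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial z} - \nu\, \nabla^2 V_z = 0 \tag{2.60}$$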

The  next  section  discusses  the  significance  of  the  shear  and  the  bulk  viscosity  terms. 


2.4.1  Shear  viscosity 

The coefficient $\nu$ that appears in the Navier Stokes equations is called the kinematic
viscosity. It is equal to the first coefficient of viscosity $\eta$ divided by the mean density
of the fluid medium. The coefficient $\eta$ varies very slowly with temperature, and
the coefficient $\nu$ varies very slowly both with temperature and with density. The
variations are so small, however, that they are ignored here. Thus, $\nu$ is assumed to
be constant. The value of $\nu$ at selected temperatures is given in section 2.6.

Figure 2-2: A vortex forms when the flow bends around a sharp corner. If the flow
speed is large, the vortex may separate and move away with the flow, while new
vortices are being formed in its place.

Physically, the $\nu$ term corresponds to friction, and it expresses the loss of momentum
due to shearing forces in the fluid. For example, when two layers of fluid slide
over each other with opposing velocities, or when a layer of fluid is moving over a flat
plate that is stationary with respect to the flow (figure 2-4), the $\nu$ term is responsible
for decelerating the neighboring layers of the fluid that move with different speeds.
Generally, the $\nu$ term is responsible for smoothing and diffusing differences in the
velocity of the fluid.

Figure 2-3: The boundary layer around a jet curls up and forms vortices that separate
from the main jet.

An important property of viscous fluids is that when the $\nu$ coefficient is very small,
the deceleration of the fluid due to viscosity occurs in small regions that are called
boundary layers (Newman [34, pp. 70-68]). Inside a boundary layer the flow velocity
changes very rapidly from one value to another, which makes the velocity gradients
very large, and thus the $\nu$ term of the Navier Stokes equations large enough that it
can not be neglected. Figure 2-4 shows the boundary layer above a flat plate, where
the plate is stationary with respect to a fast-moving flow. The speed of the fluid
changes from zero at the surface of the flat plate to some large value away from the
plate on the other side of the boundary layer.

In contrast to the above discussion of narrow boundary layers, I must clarify that
very thick boundary layers are also possible, although in this case the name “boundary
layer” is usually avoided. In particular, the boundary layers can grow slowly in space
and in time by diffusion, so that they can become very large in steady flow if the solid
boundary extends for a long distance, and if sufficient time elapses for the boundary
layer to grow from an initial non-moving state. An example of this situation is the
Hagen-Poiseuille flow (Newman [34, pp. 63-85], Landau&Lifshitz [32, p.51]) inside long
pipes, where the velocity assumes a parabolic profile eventually, and the boundary
layer can be considered to extend over the radius of the pipe. However, in the case of
unsteady flow, and in the case where the solid boundary has a limited extent (a finite
obstacle), the boundary layer can not grow, and it remains a narrow boundary layer
around the solid obstacle as described in the previous paragraph.

Under appropriate conditions, such as fast flow around sharp corners, the boundary
layer separates from the region where it is formed, and begins “to take a life of
its own” as it moves away with the flow. As soon as it separates, the boundary layer
turns around itself and forms narrow loops of turning flow, which are called vortices.
A simple way to understand this curling up is that the different sides of the boundary
layer are moving with different speeds, so that when the two sides are suddenly free,
they can only turn into themselves and curl up. Figure 2-2 shows the formation of a
vortex around a sharp corner, and figure 2-3 shows the formation of vortices around
a jet that is injected at high speed into a stationary fluid.

The  angular  speed  of  a  vortex  can  be  calculated  using  the  curl  of  the  fluid  velocity, 
which  is  called  the  vorticity.  In  three  dimensions  the  curl  of  the  velocity  is  a  3D  vector, 
while  in  two  dimensions  the  curl  of  the  velocity  is  simply  a  scalar  with  a  direction 
normal  to  the  plane  of  the  flow,  for  example  the  z-axis.  In  particular,  the  following 
formula  applies. 


$$(\nabla \times V)_z = \left( \frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y} \right) \tag{2.61}$$

In chapter 7, contour plots of the above scalar vorticity are used in order to visualize
the flow.

Figure 2-4: A boundary layer forms above a flat plate that is stationary with respect
to a fast-moving flow.
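The scalar vorticity of equation 2.61 is easy to evaluate numerically. As a minimal sketch, the code below applies central differences to a rigidly rotating fluid, for which the vorticity is known to be twice the angular speed. The velocity field, the value of `OMEGA`, and the function names are hypothetical and serve only as an illustration.

```python
def vorticity_z(vx, vy, x, y, h=1e-5):
    """Scalar vorticity of a two-dimensional flow, equation 2.61:
    dVy/dx - dVx/dy, estimated with central differences of step h."""
    dvy_dx = (vy(x + h, y) - vy(x - h, y)) / (2 * h)
    dvx_dy = (vx(x, y + h) - vx(x, y - h)) / (2 * h)
    return dvy_dx - dvx_dy

OMEGA = 3.0  # hypothetical angular speed of a rigidly rotating fluid

def rot_vx(x, y):
    """x-component of the velocity of rigid rotation about the origin."""
    return -OMEGA * y

def rot_vy(x, y):
    """y-component of the velocity of rigid rotation about the origin."""
    return OMEGA * x

w = vorticity_z(rot_vx, rot_vy, 0.4, -0.2)
print(w)  # close to 2 * OMEGA, independent of the point chosen
```

The same function can be applied to velocity data on a grid, which is how a contour plot of vorticity would typically be produced.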

It is worth mentioning that a common approximation in fluid mechanics is to
assume that the vorticity is zero for the most part of the flow. The rationale behind
this approximation is to assume that the viscosity is very small so that it can be neglected
for the most part of the flow (inviscid flow). Also, the fluid is assumed to be initially
at rest so that the vorticity is initially zero. Then, according to Kelvin’s circulation
theorem for inviscid flows, the integral of vorticity remains constant in time over any
simply-connected surface of the flow (Newman [34, p.105]). Physically, this means
(Tritton [54, p.84 and p.114]) that if the vorticity is initially zero, it will always
remain zero. Of course, the viscosity can not be neglected in boundary layers where
the velocity gradient is large. Thus, the condition of zero vorticity is always an
approximation which we hope is valid for the most part of the flow.

The condition of zero vorticity enables us to write the velocity field as the gradient
of a scalar potential function. For example, in two dimensions, the condition of zero
vorticity implies that


$$\frac{\partial V_y}{\partial x} = \frac{\partial V_x}{\partial y} \tag{2.62}$$

This  is  the  condition  of  integrability  for  an  exact  differential  (Courant  [13,  p.353]). 


$$d\phi = V_x\, dx + V_y\, dy \tag{2.63}$$

Therefore, the scalar potential function $\phi$ can be introduced in place of the vector
velocity. The scalar potential function is very useful because it enables us to calculate
analytically the solutions of many flow geometries (see chapter 4 of Newman’s
book [34]) especially in two dimensions.
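The converse direction is also easy to illustrate: a velocity field obtained as the gradient of any smooth potential automatically satisfies the integrability condition of equation 2.62. The sketch below uses a hypothetical cubic potential chosen only for this check, and computes all derivatives with central differences.

```python
def phi(x, y):
    """Hypothetical scalar potential, chosen only for illustration."""
    return x ** 3 - 3 * x * y ** 2

def velocity_from_potential(x, y, h=1e-4):
    """Velocity (Vx, Vy) as the gradient of phi, by central differences."""
    vx = (phi(x + h, y) - phi(x - h, y)) / (2 * h)
    vy = (phi(x, y + h) - phi(x, y - h)) / (2 * h)
    return vx, vy

def integrability_residual(x, y, h=1e-3):
    """dVy/dx - dVx/dy, which equation 2.62 requires to vanish."""
    dvy_dx = (velocity_from_potential(x + h, y)[1]
              - velocity_from_potential(x - h, y)[1]) / (2 * h)
    dvx_dy = (velocity_from_potential(x, y + h)[0]
              - velocity_from_potential(x, y - h)[0]) / (2 * h)
    return dvy_dx - dvx_dy

vx, vy = velocity_from_potential(0.7, -0.4)
residual = integrability_residual(0.7, -0.4)
print(vx, vy, residual)  # the residual is zero up to rounding error
```

For this particular potential the exact velocity is $(3x^2 - 3y^2,\, -6xy)$, so the printed components can be checked by hand.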

Because of its versatility in finding analytic solutions, the potential approximation
is used very widely, even when there is a lot of vorticity in the flow, and the inviscid
assumption is very questionable. The way this is done is by introducing a potential
function with singularities where the vorticity is zero everywhere except at a few
singular points called point vortices. Using such techniques, the effect of boundary
layers, and also the effect of unsteady generation of vorticity can be handled within
the framework of a potential model (chapter 4 of Newman’s book [34]). The potential
model is also useful in situations which are too complex to analyze otherwise, and the
potential model provides at least one estimate of the behavior of the flow. Such an
example is the flow near the edge of a flue pipe (for an overview see Verge94 [56] and
Hirschberg94 [26]). The success of the potential model in these situations depends on
having a good understanding of the flow in order to make the right assumptions and
the right approximations.

The above discussion on potential flow, vortex theory, and boundary layers is not
critical for the computer simulations, but it is useful background material. All of
the ideas introduced above are very important parts of fluid dynamics, and there are
entire books and chapters devoted to their study [34, 44, 3, 54].


2.4.2 Bulk viscosity

The $\mu$ term of the Navier Stokes equations 2.53-2.55 is called bulk viscosity, and
expresses the loss of momentum during elastic compression and dilatation of fluid
elements. The actual value of the $\mu$ coefficient is difficult to measure experimentally,
and is not known for many types of fluids (Tritton [54, p.58]). It is common practice
to use the following value,

$$\mu = \frac{1}{3} \nu \tag{2.64}$$

which is called the Stokes relation and corresponds to setting the second coefficient
of viscosity equal to zero, $\zeta = 0$ in equation 2.51 (Peyret&Taylor [38, p.11] and
Tritton [54, p.58]).

In the case of subsonic flow, the $\mu$ term is often omitted because the gradient of the
divergence of velocity is very small compared to the other terms of the momentum
Navier Stokes equations (see below). Accordingly, in the computer simulations I
omit the $\mu$ term when I use finite difference methods. However, when I use the
lattice Boltzmann method, I employ a positive $\mu$ term which comes with the lattice
Boltzmann method by construction. The value of the $\mu$ coefficient for the lattice
Boltzmann method (two-dimensional orthogonal model) is given by equation 4.47
of chapter 4, and depends on two lattice Boltzmann parameters $w_0$ and $y_0$. The
parameter $y_0$ is usually chosen $y_0 = [\ldots]$ and the resulting formula for $\mu$ is,

$$\mu = \nu\, (2 - 9 w_0) \tag{2.65}$$

There  is  considerable  freedom  in  choosing  Wq  within  the  constraints  rco  >  0  and 
5  rco  +  -^0  =  1  where  Zq  >  0.  For  example,  the  value  Wq  =  5/27  produces  the  Stokes 
relation  //  =  |z/  (see  section  4.1.3  regarding  the  maximum  value  of  Wq  for  stability). 
In  my  simulations,  I  also  use  the  values  Wq  =  1/7  and  Wq  =  1/6  which  produce  slightly 
larger  values  of  //  than  the  Stokes  relation.  I  do  not  pay  much  attention  to  the  precise 
value  of  n  because  the  //  term  is  very  small  in  subsonic  flow. 
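
The arithmetic behind these choices of $w_0$ can be checked with a short script. This is an illustrative sketch only: it assumes that equation 2.65 reads mu = nu (2 - 9 w0), and the function name `bulk_to_shear_ratio` is hypothetical.

```python
from fractions import Fraction

def bulk_to_shear_ratio(w0):
    """Return mu/nu = 2 - 9*w0 (the reading of equation 2.65 assumed here)."""
    return 2 - 9 * Fraction(w0)

# The three values of w0 mentioned in the text.
RATIOS = {w0: bulk_to_shear_ratio(w0)
          for w0 in (Fraction(5, 27), Fraction(1, 7), Fraction(1, 6))}

for w0, ratio in RATIOS.items():
    print(f"w0 = {w0}: mu/nu = {ratio}")
```

Under this assumption, $w_0 = 5/27$ gives mu/nu = 1/3 (the Stokes relation), while $w_0 = 1/7$ and $w_0 = 1/6$ give 5/7 and 1/2 respectively.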

The reason why the $\mu$ term is very small in subsonic flow compared to the other
terms of the momentum Navier Stokes equations is as follows. The continuity equation
shows that the divergence of velocity is directly proportional to changes in density.
The momentum equation shows that changes in density are proportional to changes
in fluid momentum after multiplication by the square of the speed of sound. Since
the speed of sound is much larger than the fluid speed in subsonic flow, the gradient
of the divergence of velocity (the $\mu$ term) is expected to be small compared to the
other terms in the momentum equation.

The above argument can be made precise by obtaining an estimate for the gradient
of the divergence of velocity from the continuity equation 2.53, as follows,

$$\frac{\partial (\nabla \cdot \rho V)}{\partial x} = -\frac{\partial}{\partial x}\left(\frac{\partial \rho}{\partial t}\right) \tag{2.66}$$


The above estimate of the divergence of velocity (multiplied by $\rho$) must be compared
against the other terms of the momentum equation. Let us choose the pressure term
as a good representative of the size of the terms in the momentum Navier Stokes
equations. We inquire whether the following inequality is true,

$$\mu \left| \frac{\partial}{\partial t}\left(\frac{\partial \rho}{\partial x}\right) \right| \ll c_s^2 \left| \frac{\partial \rho}{\partial x} \right| \tag{2.67}$$

where the symbol $\ll$ means “very small compared to”. To prove this inequality, we
can estimate that the time derivative of $\partial\rho/\partial x$ can not be larger than the present value
of $\partial\rho/\partial x$ times the speed of sound divided by some wavelength $\lambda$ that corresponds
to this disturbance. This is because the fastest changes in subsonic flow propagate at
the speed of sound. Thus, we can write,

$$\left| \frac{\partial}{\partial t}\left(\frac{\partial \rho}{\partial x}\right) \right| \lesssim \left(\frac{c_s}{\lambda}\right) \left| \frac{\partial \rho}{\partial x} \right| \tag{2.68}$$

From the above, we can conclude that in the case of subsonic flow the inequality 2.67
is equivalent to the following inequality,

$$\mu\,\frac{c_s}{\lambda} \ll c_s^2 \tag{2.69}$$

or

$$\mu \ll c_s\,\lambda \tag{2.70}$$

In the case of air, the bulk viscosity coefficient $\mu$ is about 0.075 cm$^2$/s (using
$\mu = \nu/2$), and the speed of sound is $c_s = 34400$ cm/s (see section 2.6). The smallest
wavelength (smallest length of disturbance) at which the Navier Stokes equations are
applicable is about $\lambda = 10^{-3}$ cm as explained in section 2.1, so the above inequality
is well satisfied (the ratio $\mu/(c_s \lambda)$ is about $2 \times 10^{-3}$). Also, the wavelength
$\lambda = 10^{-3}$ cm corresponds to a frequency $f = c_s/\lambda = 34$ MHz which is well beyond
the range of acoustic frequencies that we are interested in for musical instruments,
namely less than 20 kHz. Therefore, it is reasonable to ignore the $\mu$ term in the
computer simulations of flue pipes. In section 2.5.2, the effect of bulk viscosity on the
decay of acoustic waves is calculated exactly, and is found to be extremely small.
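
The size of the ratio in the inequality above can be evaluated directly. The following sketch uses the values quoted in the text; the wavelength $10^{-3}$ cm is the reading assumed here (it is the one consistent with the quoted 34 MHz), and the helper names are hypothetical.

```python
C_S = 34400.0      # speed of sound in air, cm/s (from the text)
MU = 0.075         # bulk viscosity coefficient of air, cm^2/s (from the text)
LAMBDA_MIN = 1e-3  # smallest wavelength considered, cm (assumed reading of the text)

def bulk_term_ratio(wavelength, mu=MU, c_s=C_S):
    """Return mu / (c_s * wavelength); the inequality requires this to be << 1."""
    return mu / (c_s * wavelength)

def frequency(wavelength, c_s=C_S):
    """Frequency in Hz of a disturbance with the given wavelength in cm."""
    return c_s / wavelength

print(bulk_term_ratio(LAMBDA_MIN))         # at the smallest wavelength
print(bulk_term_ratio(C_S / 20e3))         # at 20 kHz, the top of the audible range
print(frequency(LAMBDA_MIN) / 1e6, "MHz")  # frequency of the smallest wavelength
```

The ratio is already small at the shortest wavelength and becomes several orders of magnitude smaller at audible wavelengths, which is the regime relevant to flue pipes.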

2.4.3  Incompressible  flow  approximation 

This  section  describes  the  incompressible  flow  approximation  of  the  Navier  Stokes 
equations.  This  approximation  is  not  used  in  the  computer  simulations,  but  is  useful 
background  material. 

First, a word on terminology is in order. An incompressible flow is also called
“hydrodynamic flow” in view of the fact that the compressibility of water is very
small. Of course, this is only a naming convention, and does not imply that water
is perfectly incompressible, which is false. Further, the term “hydrodynamic” is also
used to distinguish the dynamics of a flow which do not depend on compressible effects
(the hydrodynamics) from the dynamics of a flow which do depend on compressible
effects (the acoustic waves). In other words, the name “hydrodynamic” is a general
term, and is somewhat different from the precise notion of incompressibility which is
the subject of this section.

In incompressible flow, the continuity equation is replaced with the condition that
the divergence of the velocity is zero,

$$\nabla \cdot V = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} = 0 \tag{2.71}$$

where $u, v, w$ denote the $x, y, z$ components of the velocity. Also, the term
$(c_s^2/\rho)\,\partial\rho/\partial x$ is usually written as $\partial P/\partial x$ so that the density
variable does not appear at all in the equations of fluid flow. Aside from this change,
the momentum equations 2.58-2.60 remain unchanged in all other respects.

The rationale behind the incompressible flow idea is to assume that the time
derivative and the spatial variations of the density are very small compared to the
velocity gradients. Such an assumption originates from the fact that the density
gradient $\partial\rho/\partial x$ is proportional to the derivatives of velocity divided by the square
of the speed of sound (see the momentum Navier Stokes equations). Because the
ratio $V/c_s$ is very small in the case of subsonic flow, the density gradient is very small
compared to the derivatives of the velocity.

Physically, the condition of incompressibility (zero divergence of the velocity) implies
that any disturbances of the fluid propagate with infinite speed to other parts
of the fluid. This is only an approximation because any disturbances of a real compressible
fluid propagate with a finite speed of sound to other parts of the fluid via
acoustic waves. The advantage of assuming an infinite speed of sound propagation is
to allow us to solve the Navier Stokes equations without having to follow the propagation
of acoustic waves step by step. Thus, the numerical solutions of the Navier
Stokes equations can be sped up, and the theoretical analysis of the equations can
be simplified in many cases as well.

Another way of understanding the incompressible flow approximation is to consider
the following situation which appears paradoxical at first sight. A solution of
the incompressible flow equations is steady flow inside long pipes, known as Hagen-Poiseuille
flow (Landau & Lifshitz [32, p.51]). Let us consider a two-dimensional pipe
with the walls located at $y = 1$ and $y = 0$, and let us assume a constant pressure
gradient along $x$,

$$\frac{\partial P}{\partial x} = \frac{c_s^2}{\rho_0}\frac{\partial \rho}{\partial x} = \text{constant} \tag{2.72}$$

where the adiabatic relation between pressure and density is used. As stated earlier,
the density gradient is extremely small because the speed of sound is extremely large
compared to the flow speed, which is the reason why we can neglect the density
variations from the continuity equation and arrive at the incompressible flow condition
$\nabla \cdot V = 0$. The steady state solution of the above problem is a vanishing vertical
velocity $v = 0$, and a parabolic profile for the horizontal velocity $u$,

$$u(y) = \frac{1}{2\nu}\frac{\partial P}{\partial x}\,y(y-1) = \frac{1}{2}\left(\frac{c_s^2}{\nu \rho_0}\frac{\partial \rho}{\partial x}\right) y(y-1) \tag{2.73}$$

By substitution, we can show that the above solution satisfies the momentum Navier
Stokes equations, the condition of incompressibility, and the boundary conditions of
vanishing velocity at the walls. However, a paradox arises when we try to substitute
the above solution into the continuity equation 2.57. The continuity equation
expresses the conservation of fluid, and it must always be satisfied, independent of the
approximations that we introduce. In expanded form we have,

$$\frac{\partial \rho}{\partial t} + u\frac{\partial \rho}{\partial x} + v\frac{\partial \rho}{\partial y} + \rho\frac{\partial u}{\partial x} + \rho\frac{\partial v}{\partial y} = 0 \tag{2.74}$$


All the terms except $u\,\partial\rho/\partial x$ vanish according to our solution, therefore the term
$u\,\partial\rho/\partial x$ must vanish also. This is an apparent contradiction because we know that
the density gradient $\partial\rho/\partial x$ must be very small, but not identically zero. The term
$u\,\partial\rho/\partial x$ expresses the change of density which is caused by the flow, and from a
physical point of view there must be another term that balances this change of density,
no matter how small it may be. The question is “which term balances the change of
density?”

One way of resolving the paradox is to add a correction to the $\rho\,\partial u/\partial x$ term
in order to balance equation 2.74. This is a reasonable assumption in view of the
steadiness and symmetry of the problem along the $y$ direction. Thus, we assume
that the horizontal velocity has a very small but non-vanishing variation in $x$, and
we introduce a modified velocity $\tilde{u}$ with a correction $\epsilon(x)$,

$$\tilde{u} = u\,[1 + \epsilon(x)] \tag{2.75}$$

The continuity equation is satisfied to first order in $\epsilon$ (that is, the error is of order $\epsilon^2$)
if the correction $\epsilon$ is as follows,

$$\epsilon(x) = -\left(\frac{1}{\rho_0}\frac{\partial \rho}{\partial x}\right) x \tag{2.76}$$

Thus, we have found that the horizontal velocity has a very small but non-vanishing
variation in $x$. Our original solution must be modified according to the following
formula,

$$\tilde{u}(y,x) = \frac{1}{2}\left(\frac{c_s^2}{\nu \rho_0}\frac{\partial \rho}{\partial x}\right) y(y-1)\left[1 - \left(\frac{1}{\rho_0}\frac{\partial \rho}{\partial x}\right) x\right] \tag{2.77}$$
The importance of the above correction is that it helps us understand better the
approximation of incompressible flow. In practice, the correction is not very useful because
it is exceedingly small. In particular, if the size of the velocity is of order unity,
$u \sim 1$ cm/s, then the normalized density gradient $(1/\rho_0)\,\partial\rho/\partial x$ is of the order $1/c_s^2$
which is about $10^{-9}$ in air at room temperature and atmospheric pressure. If we can
measure the flow velocity with an accuracy of $10^{-3}$ (3 decimals), then the above
correction to the velocity becomes noticeable between the ends of a pipe that is 10 km
long. This is of course unrealistic because other effects that we have not modeled
here become important in a 10 km pipe.
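
The pipe length quoted above follows from dividing the measurement accuracy by the normalized density gradient. A minimal sketch of that estimate, assuming the order-of-magnitude value $1/c_s^2$ per cm for the gradient (the function name is hypothetical):

```python
C_S = 34400.0  # speed of sound in air, cm/s (from the text)

def pipe_length_for_correction(accuracy, c_s=C_S):
    """Length x (cm) at which the relative correction |epsilon(x)| equals `accuracy`.

    Assumes the order-of-magnitude estimate (1/rho0) d(rho)/dx ~ 1/c_s^2 per cm
    quoted in the text for a flow speed of about 1 cm/s.
    """
    normalized_gradient = 1.0 / c_s**2   # per cm
    return accuracy / normalized_gradient

length_km = pipe_length_for_correction(1e-3) / 1e5   # 1 km = 1e5 cm
print(f"{length_km:.1f} km")
```

The result is on the order of 10 km, in line with the estimate in the text.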

The above analysis of Hagen-Poiseuille flow in a pipe is an example where the
incompressible flow approximation works very well. By contrast, the flow of air
inside a wind musical instrument is an example where the acoustic waves interact
with the hydrodynamics of the flow. In such a situation, the incompressible flow
approximation is inapplicable, and the compressible Navier Stokes equations must be
used instead. Below, the wave equation is discussed because it is useful background
material for the simulations of flue pipes.

2.5 The wave equation

In this section, the wave equation is derived from the compressible Navier Stokes
equations, and some interesting solutions of the wave equation are described.
An acoustic wave in a compressible medium is usually defined as an oscillatory motion
of small amplitude [32, p.251]. The oscillation arises through the interchange
of energy between the kinetic and the potential forms; namely, the velocity and the
density (pressure) of the compressible medium. By means of this oscillatory mechanism,
any disturbance of the density and/or the velocity of the compressible medium
propagates inside the medium and reflects off boundaries. The speed of propagation
is characteristic of the medium, and it is called the speed of sound.


2.5.1  Linear  inviscid 


In order to derive the wave equation from the Navier Stokes equations, we consider
the simplest situation from an acoustic point of view. Namely, we assume that the
mean flow is zero, and that the amplitude of the acoustic disturbance is small.
Mathematically, this means that the fluid velocity and density can be written as follows,

$$\begin{aligned}
\rho &= \rho_0 + \rho', \qquad \rho_0 \ \text{constant}, \quad \rho' \ll \rho_0 \\
V &= V_0 + V', \qquad V_0 = 0, \quad V' \ \text{small}
\end{aligned} \tag{2.78}$$

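We also consider the compressible Navier Stokes equations in their original form,

$$\frac{\partial \rho}{\partial t} + (V \cdot \nabla \rho) + \rho\,(\nabla \cdot V) = 0 \tag{2.79}$$

$$\frac{\partial V_j}{\partial t} + V_k \frac{\partial V_j}{\partial x_k} + \frac{c_s^2}{\rho}\frac{\partial \rho}{\partial x_j} - \nu \nabla^2 V_j - \mu \frac{\partial (\nabla \cdot V)}{\partial x_j} = 0 \tag{2.80}$$

where $j = 1,2,3$ stands for the Cartesian directions $x, y, z$, and summation over the
repeated index $k$ is implied. If we substitute the density and the velocity given by
equation 2.78 in the above Navier Stokes equations, and neglect small terms that are
quadratic in the acoustic amplitude, we obtain,

$$\frac{\partial \rho'}{\partial t} + \rho_0\,\nabla \cdot V' = 0 \tag{2.81}$$

$$\frac{\partial V'_j}{\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial x_j} = 0 \tag{2.82}$$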


In the above calculation, the quadratic terms $(V' \cdot \nabla \rho')$ and $V'_k\,\partial V'_j/\partial x_k$ are omitted
from the continuity and the momentum equations respectively. Also, the approximation
$(\rho_0 + \rho') \approx \rho_0$ is used whenever the density $\rho$ appears as a multiplicative
factor. Further, the viscosity terms are omitted because they have a negligible effect
on the acoustic waves, as we shall see in section 2.5.2. These may seem a lot of
simplifications, but they are very reasonable for sound waves of small amplitude in
air.


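To proceed further, we try to obtain an equation involving the density only. We
differentiate the continuity equation 2.81 with respect to time, and the momentum
equation 2.82 with respect to the spatial direction $x_j$ in order to eliminate the velocity
from the density equation. In two dimensions we have three equations. We use the
notation $u = V_1$ and $v = V_2$ for the $x, y$ components of the velocity.

$$\frac{\partial^2 \rho'}{\partial t^2} + \rho_0\left(\frac{\partial^2 u}{\partial x\,\partial t} + \frac{\partial^2 v}{\partial y\,\partial t}\right) = 0 \tag{2.83}$$

$$\frac{\partial^2 u}{\partial x\,\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial^2 \rho'}{\partial x^2} = 0 \tag{2.84}$$

$$\frac{\partial^2 v}{\partial y\,\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial^2 \rho'}{\partial y^2} = 0 \tag{2.85}$$

By subtracting the above momentum equations (multiplied by $\rho_0$) from the continuity
equation, we obtain a linear wave equation for the acoustic density,

$$\frac{\partial^2 \rho'}{\partial t^2} - c_s^2\left(\frac{\partial^2 \rho'}{\partial x^2} + \frac{\partial^2 \rho'}{\partial y^2}\right) = 0 \tag{2.86}$$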

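A complementary approach is to try to obtain an equation involving the acoustic
velocity only. To do so, we differentiate the continuity equation 2.81 with respect
to $x$ and $y$, and the momentum equation 2.82 with respect to time. Then, we can
eliminate the density from the velocity equations to obtain,

$$\frac{\partial^2 u}{\partial t^2} - c_s^2\,\frac{\partial (\nabla \cdot V)}{\partial x} = 0 \tag{2.87}$$

$$\frac{\partial^2 v}{\partial t^2} - c_s^2\,\frac{\partial (\nabla \cdot V)}{\partial y} = 0 \tag{2.88}$$

or equivalently,

$$\frac{\partial^2 u}{\partial t^2} - c_s^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 v}{\partial x\,\partial y}\right) = 0 \tag{2.89}$$

$$\frac{\partial^2 v}{\partial t^2} - c_s^2\left(\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial^2 v}{\partial y^2}\right) = 0 \tag{2.90}$$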


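The above appear to be coupled equations in $x, y$; however, the fact that we have
omitted viscosity from our acoustic model (see equation 2.82) can actually decouple
the above equations. By calculating the curl of equation 2.82, we can show that the
vorticity of the acoustic flow is constant in time, as follows,

$$\frac{\partial}{\partial t}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) = -\frac{c_s^2}{\rho_0}\left(\frac{\partial^2 \rho'}{\partial x\,\partial y} - \frac{\partial^2 \rho'}{\partial y\,\partial x}\right) = 0 \tag{2.91}$$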


Furthermore, a reasonable assumption for acoustic waves is that the acoustic motion
starts from an initial state of rest, so that the vorticity is initially zero. The above
equation shows that the vorticity always remains zero. Thus, the following condition
of integrability is satisfied (Courant [13, p.353]),

$$d\phi = u\,dx + v\,dy \tag{2.92}$$


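and the acoustic velocity is the gradient of a scalar potential function, denoted here by
$\phi$. The above acoustic potential is a special case of the general potential model (zero
vorticity approximation) that is discussed in section 2.4.1. We have the relations,

$$\begin{aligned}
u &= \partial\phi/\partial x \\
v &= \partial\phi/\partial y \\
\partial^2 u/(\partial x\,\partial y) &= \partial^3\phi/(\partial x^2\,\partial y) = \partial^2 v/\partial x^2 \\
\partial^2 v/(\partial x\,\partial y) &= \partial^3\phi/(\partial x\,\partial y^2) = \partial^2 u/\partial y^2
\end{aligned} \tag{2.93}$$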


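By substituting the above into equations 2.89, 2.90, we obtain the linear wave equation
for each component of the velocity,

$$\frac{\partial^2 u}{\partial t^2} - c_s^2\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) = 0 \tag{2.94}$$

$$\frac{\partial^2 v}{\partial t^2} - c_s^2\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) = 0 \tag{2.95}$$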


By substituting the potential $\phi$ in equation 2.82, and integrating along $x_j$ as follows,

$$\int \left(\frac{\partial^2 \phi}{\partial t\,\partial x_j} + \frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial x_j}\right) dx_j = 0 \tag{2.96}$$

we obtain

$$\frac{\partial \phi}{\partial t} + C(t) = -\left(\frac{c_s^2}{\rho_0}\right)\rho' \tag{2.97}$$

where $C(t)$ is an integration constant that can only depend on time. Therefore,
it can be absorbed in the velocity potential without affecting the spatial gradient.
For example, we can redefine the potential as follows (for a related problem see
Newman [34, p.108]),

$$\phi' = \phi + \int C(\tau)\,d\tau \tag{2.98}$$

but we keep the same symbol $\phi$ below for simplicity. Thus, we obtain a simple relation
between the acoustic density and the acoustic potential,

$$\rho' = -\frac{\rho_0}{c_s^2}\frac{\partial \phi}{\partial t} \tag{2.99}$$


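and also a linear wave equation for the potential,

$$\frac{\partial^2 \phi}{\partial t^2} - c_s^2\left(\frac{\partial^2 \phi}{\partial x^2} + \frac{\partial^2 \phi}{\partial y^2}\right) = 0 \tag{2.100}$$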


A typical solution of the above linear wave equation is a plane wave traveling along
the positive $x$ direction,

$$\begin{aligned}
\phi(t,x,y) &= f(x - c_s t) \\
u(x,y,t) &= f'(x - c_s t) \\
\rho'(t,x,y) &= (\rho_0/c_s)\,f'(x - c_s t)
\end{aligned} \tag{2.101}$$

where $f(x)$ is an arbitrary differentiable function of one variable, and $f'(x)$ denotes
its first derivative. The negative-traveling wave $f(x + c_s t)$ is also a solution. Because
the wave equation is linear, any superposition of solutions is also a solution. In
particular, the complex exponentials that satisfy the wave equation can be used to
represent almost any solution as a sum (integral) of complex exponentials according
to Fourier's decomposition theorem (Courant [13, p.318]). The complex exponentials
that satisfy the wave equation are as follows,

$$u(x,y,t) = e^{i(kx - \omega t)} \tag{2.102}$$


The above complex exponential is a periodic traveling wave that advances in the
positive $x$ direction with increasing time. The speed of propagation is the speed of
sound $c_s$, and we have the following relations,

$$c_s = f\lambda \tag{2.103}$$

$$\lambda = \frac{2\pi}{k} \tag{2.104}$$

$$\omega = 2\pi f \tag{2.105}$$

where $k$ is the spatial frequency, $\lambda$ is the wavelength, $f$ is the time frequency in cycles
per second (Hz), and $\omega$ is the time frequency in radians per second. Because of the
linearity of the wave equation, it is valid to calculate with complex exponentials which
are more convenient than sines and cosines, as long as we are consistent in taking
the real part (or the imaginary part) of our expressions at the beginning and the end
of the calculation. For example, the complex exponential solution of equation 2.102
“contains” the following two physical solutions,

$$\begin{aligned}
u(x,y,t) &= \cos(kx - \omega t) \\
u(x,y,t) &= \sin(kx - \omega t)
\end{aligned} \tag{2.106}$$


depending  on  whether  we  choose  the  real  or  the  imaginary  part. 
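
A quick numerical check can make the role of the relation between the time frequency and the spatial frequency concrete. The sketch below evaluates the residual of the one-dimensional wave equation for the real solution cos(kx - wt) using centered second differences; the 1 kHz frequency, the sample point, and the helper names are illustrative assumptions.

```python
import math

C_S = 34400.0                  # speed of sound in air, cm/s
FREQ = 1000.0                  # Hz, illustrative choice
OMEGA = 2.0 * math.pi * FREQ   # time frequency in radians per second
K = OMEGA / C_S                # spatial frequency that satisfies omega = c_s * k

def plane_wave(x, t, omega=OMEGA, k=K):
    """Real part of the complex exponential traveling wave."""
    return math.cos(k * x - omega * t)

def relative_residual(u, x, t, c=C_S, ht=1e-6):
    """|u_tt - c^2 u_xx| / |u_tt| using centered second differences."""
    hx = c * ht
    u_tt = (u(x, t + ht) - 2.0 * u(x, t) + u(x, t - ht)) / ht**2
    u_xx = (u(x + hx, t) - 2.0 * u(x, t) + u(x - hx, t)) / hx**2
    return abs(u_tt - c**2 * u_xx) / abs(u_tt)

# Wave with omega = c_s * k: residual at round-off level.
good = relative_residual(plane_wave, 1.0, 1e-4)
# Wave with a 10% mismatch in omega: residual of order 0.1.
bad = relative_residual(lambda x, t: plane_wave(x, t, omega=1.1 * OMEGA), 1.0, 1e-4)
print(good, bad)
```

The residual is negligible only when the time frequency equals the speed of sound times the spatial frequency, which is the content of the relations between frequency, wavelength, and speed of sound given above.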

Apart from traveling waves, there are also stationary waves, which means that the
time variation is decoupled from the spatial variation. Stationary waves arise when
we impose fixed boundary conditions such as two walls at which the acoustic velocity
must vanish. The simplest way to construct a stationary wave is to combine two
periodic traveling waves that are identical except for traveling in opposite directions.
For example,

$$\frac{1}{2}\left(e^{i(kx - \omega t)} + e^{i(kx + \omega t)}\right) = e^{ikx}\cos(\omega t) \tag{2.107}$$

which corresponds to the following two stationary waves,

$$\begin{aligned}
u(x,t) &= \cos(kx)\cos(\omega t) \\
u(x,t) &= \sin(kx)\cos(\omega t)
\end{aligned} \tag{2.108}$$

A stationary wave possesses nodes and loops that are fixed in space. A node is a point
where the amplitude vanishes, while a loop is a point where the amplitude achieves
maximum values during one period of oscillation. In the case of stationary waves,
the velocity nodes are density loops, and the density nodes are velocity loops. To
see this, we calculate the density that corresponds to the velocity of equation 2.108
by differentiating in space and integrating in time as prescribed by the continuity
equation 2.81. We obtain,

$$\begin{aligned}
\rho'(x,t) &= \rho_0\,(k/\omega)\,\sin(kx)\sin(\omega t) \\
\rho'(x,t) &= -\rho_0\,(k/\omega)\,\cos(kx)\sin(\omega t)
\end{aligned} \tag{2.109}$$

We see that the loops and nodes are interchanged between density and velocity in
the case of stationary waves. This is in contrast to a free sinusoidal traveling wave
(equation 2.101) where the loops and nodes occur at the same locations for the velocity
and the density, and furthermore the loops and nodes are moving with time.
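
The interchange of nodes and loops can also be seen numerically by integrating the linearized continuity equation in time for a standing wave. The sketch below is illustrative: it assumes the standing wave u = sin(kx) cos(wt), a nominal air density of 1.2e-3 g/cm^3, and a wavelength of 34.4 cm (1 kHz); all names are hypothetical.

```python
import math

RHO0 = 1.2e-3      # g/cm^3, nominal density of air (assumed value)
C_S = 34400.0      # cm/s
WAVELENGTH = 34.4  # cm, i.e. 1 kHz (illustrative)
K = 2.0 * math.pi / WAVELENGTH
OMEGA = C_S * K

def velocity(x, t):
    """Standing wave with velocity nodes at x = 0 and x = WAVELENGTH / 2."""
    return math.sin(K * x) * math.cos(OMEGA * t)

def density_at_quarter_period(x, steps=2000):
    """Integrate d(rho')/dt = -rho0 * du/dx from t = 0 to T/4 (midpoint rule)."""
    quarter = 0.5 * math.pi / OMEGA
    dt = quarter / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        du_dx = K * math.cos(K * x) * math.cos(OMEGA * t)
        total += -RHO0 * du_dx * dt
    return total

print(velocity(0.0, 0.0), density_at_quarter_period(0.0))                # velocity node
print(velocity(WAVELENGTH / 4, 0.0), density_at_quarter_period(WAVELENGTH / 4))  # velocity loop
```

At the velocity node the density perturbation reaches its largest magnitude, and at the velocity loop it vanishes.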

Finally, it should be noted that the solutions of the wave equation 2.82 for a
compressible medium such as air are longitudinal waves in the sense that the acoustic
velocity oscillates along the same direction as the direction of wave propagation. By
contrast, the sound waves of a violin string are transversal oscillations in the sense that
the acoustic motion is at right angles to the direction of propagation along the string.
We can examine mathematically the longitudinal character of the wave equation 2.82
by trying to find a transversal solution as follows,

$$\begin{aligned}
u(x,y,t) &= u(y,t) \\
v(x,y,t) &= v(x,t)
\end{aligned} \tag{2.110}$$

Then, the divergence of velocity is identically zero. The $x$ momentum equation 2.82
implies that the density must be constant with $x$ because the velocity $u(y,t)$ only
varies with $y$. Similarly, the $y$ momentum equation 2.82 implies that the density
must be constant with $y$ because the velocity $v(x,t)$ only varies with $x$. The
integration of the continuity equation 2.81 says that the density $\rho'$ must be constant in
time. Consequently, there can be no wave motion at all. In fact, if we integrate
equation 2.82, we obtain

$$\begin{aligned}
u(y,t) &= -\frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial x}\,t + u(y,0) \\
v(x,t) &= -\frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial y}\,t + v(x,0)
\end{aligned} \tag{2.111}$$

which says that the velocity becomes infinite with time. In other words, there are
no physically relevant solutions in this case. We note that if we include viscosity
in the wave equation, then we can obtain transverse oscillations of velocity that are
physically relevant. We will examine these in the next section. However, the density
of these transverse waves is also constant in time, so that the transverse waves can
not be considered to be sound waves.

Having introduced the linear wave equation and some of its solutions, we discuss
in the next section the solutions of a modified wave equation that includes the effects
of viscosity. The modified equation of the next section is still linear, and thus it is
straightforward to solve analytically.

2.5.2  Viscous  decay  of  sound 

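In this section, the effect of viscosity on sound waves is calculated exactly in the
special case of one-dimensional plane waves. (This problem is also discussed in a
slightly different way in Rayleigh [42, p.317], Lamb [31, p.647], and Morse & Ingard
[33, p.285].) To this end, we retain the viscous terms of the Navier Stokes equations
that we omitted earlier in equation 2.82. In place of equations 2.81, 2.82 we have the
following,

$$\frac{\partial \rho'}{\partial t} + \rho_0\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) = 0 \tag{2.112}$$

$$\frac{\partial u}{\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial x} - \nu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right) - \mu\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 v}{\partial y\,\partial x}\right) = 0 \tag{2.113}$$

$$\frac{\partial v}{\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial y} - \nu\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) - \mu\left(\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial^2 v}{\partial y^2}\right) = 0 \tag{2.114}$$

In the above equations we retain both the shear and the bulk viscosity terms because
they have comparable size in the case of acoustic waves. We also inquire whether
we should retain the nonlinear advection terms $(u\,\partial u/\partial x)$; however, these terms are
smaller than the viscous terms, as we shall see in section 2.5.4, when the amplitude
of the sound wave is sufficiently small, which we assume to be so.

We look for a plane wave solution with $v = 0$ and $u = u(x,t)$, so that the
equations simplify as follows,

$$\frac{\partial \rho'}{\partial t} + \rho_0\frac{\partial u}{\partial x} = 0 \tag{2.115}$$

$$\frac{\partial u}{\partial t} + \frac{c_s^2}{\rho_0}\frac{\partial \rho'}{\partial x} - (\nu + \mu)\frac{\partial^2 u}{\partial x^2} = 0 \tag{2.116}$$

By differentiating and combining the above equations, we obtain

$$\frac{\partial^2 u}{\partial t^2} - \left(c_s^2 + \tilde{\nu}\frac{\partial}{\partial t}\right)\frac{\partial^2 u}{\partial x^2} = 0 \tag{2.117}$$

where we denote $\tilde{\nu} = (\nu + \mu)$ for brevity. We look for general solutions of the form,

$$u = e^{i(kx - \omega t) - \alpha t} \tag{2.118}$$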


where $k, \omega, \alpha$ are real numbers, and they correspond to a spatial frequency, a time
frequency, and a time constant of exponential decay. By substituting the above into
the wave equation 2.117, we obtain,

$$(i\omega + \alpha)^2 = -k^2\left(c_s^2 - \tilde{\nu}\,(i\omega + \alpha)\right) \tag{2.119}$$

which can be solved exactly to give,

$$\alpha = \frac{\tilde{\nu}\,k^2}{2} \tag{2.120}$$

$$\frac{\omega}{k} = c_s\sqrt{1 - \frac{\tilde{\nu}^2 k^2}{4 c_s^2}} \tag{2.121}$$


Therefore, we have obtained a periodic traveling wave solution that decays with a
time constant $\alpha$ given by equation 2.120. The density function that corresponds to
the velocity of equation 2.118 can be calculated using the continuity equation, and it
is as follows,

$$\rho' = \frac{k\,\rho_0}{\alpha^2 + \omega^2}\,(\omega + i\alpha)\,e^{i(kx - \omega t) - \alpha t} \tag{2.122}$$

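By superposing two periodic traveling solutions that travel in opposite directions (the
opposite traveling solution can be obtained by negating the time frequency $\omega$ in the
above equations), we obtain a stationary solution,

$$u = \tfrac{1}{2}\,e^{-\alpha t}\left(e^{i(kx - \omega t)} + e^{i(kx + \omega t)}\right) = e^{-\alpha t}\,e^{ikx}\cos(\omega t) \tag{2.123}$$

$$\rho' = \frac{k\,\rho_0\,e^{-\alpha t}}{\alpha^2 + \omega^2}\,e^{ikx}\left(i\alpha\cos(\omega t) - i\omega\sin(\omega t)\right) \tag{2.124}$$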


By taking the imaginary part of the above expressions, we can obtain the following
stationary solution in real numbers,

$$u = e^{-\alpha t}\,\sin(kx)\cos(\omega t) \tag{2.125}$$

$$\rho' = \frac{k\,\rho_0\,e^{-\alpha t}}{\alpha^2 + \omega^2}\left(-\omega\cos(kx)\sin(\omega t) + \alpha\cos(kx)\cos(\omega t)\right) \tag{2.126}$$

Returning now to equation 2.120 for the decay constant, we can see that large spatial
frequencies (short wavelength) decay faster than small spatial frequencies (long
wavelength). However, the decay is extremely slow even for very large spatial frequencies.
To see this, we use the values $c_s = 34400$ cm/s and $\tilde{\nu} = (\nu + 0.5\,\nu) = 0.225$ cm$^2$/s,
and we write

$$\frac{\omega}{k} = c_s\sqrt{1 - \epsilon} \tag{2.127}$$

The correction $\epsilon$ is very small for all frequencies of interest,

$$\epsilon = \frac{\tilde{\nu}^2 k^2}{4 c_s^2} = k^2 \times 1.069 \times 10^{-11} \tag{2.128}$$

with $k$ in cm$^{-1}$. Therefore,

$$\frac{\omega}{k} \approx c_s \tag{2.129}$$

In particular, $\epsilon$ is equal to 1/100 when $k$ is about $k = 3 \times 10^{4}$. The correction $\epsilon$
decreases quadratically with smaller frequencies, so we can safely approximate $\omega = c_s k$
for all spatial frequencies up to $k = 3 \times 10^{4}$. Furthermore, the frequency $k = 3 \times 10^{4}$
is larger than the maximum frequency $k = 6.28 \times 10^{3}$ at which the Navier Stokes
equations are applicable (see section 2.1). Therefore, the relation $\omega = c_s k$ is valid for
all frequencies of interest.
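
The decay constant and the phase speed can be recovered numerically by solving the characteristic equation of the damped wave equation for a single spatial mode. The sketch below is an illustration under stated assumptions: it takes the damped wave equation in the form u_tt = c_s^2 u_xx + nu u_xxt with nu = 0.225 cm^2/s, and the function name `temporal_mode` is hypothetical.

```python
import cmath
import math

C_S = 34400.0      # speed of sound in air, cm/s
NU_TILDE = 0.225   # combined viscosity (shear plus bulk), cm^2/s

def temporal_mode(k, c_s=C_S, nu=NU_TILDE):
    """Return (alpha, omega) for a mode u = exp(i k x + s t), s = -alpha - i omega.

    s is a root of s^2 + nu k^2 s + c_s^2 k^2 = 0, which follows from
    u_tt = c_s^2 u_xx + nu u_xxt (the damped wave equation assumed here).
    """
    b = nu * k * k
    root = cmath.sqrt(b * b - 4.0 * c_s**2 * k * k)
    s = (-b - root) / 2.0
    return -s.real, -s.imag

k_audio = 2.0 * math.pi * 1000.0 / C_S       # spatial frequency of a 1 kHz wave
alpha, omega = temporal_mode(k_audio)
print(alpha, math.log(10.0) / alpha)         # decay rate (1/s), time to fall to 1/10 (s)

alpha_hi, omega_hi = temporal_mode(3e4)      # spatial frequency where epsilon ~ 1/100
print(1.0 - (omega_hi / (C_S * 3e4))**2)     # relative reduction of (omega/k)^2
```

For the 1 kHz mode the decay rate equals nu k^2 / 2 and the time to decay to one-tenth is several hundred seconds, while the phase speed is indistinguishable from the speed of sound.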


We can calculate how many cycles, denoted by $N$, it takes for a sinusoidal acoustic
wave to decay to one-tenth of its original value by setting,

$$e^{-\alpha N 2\pi/\omega} = \frac{1}{10} \tag{2.130}$$

By combining the above relation with equation 2.120, and the relation $k = \omega/c_s$, we
obtain a relation between the frequency of the acoustic wave and the number of cycles
$N$ it takes for the wave to decay to one-tenth of its initial value,

$$f = \frac{\omega}{2\pi} = \frac{c_s^2\,\ln 10}{2\pi^2\,\tilde{\nu}\,N} = \frac{6.135 \times 10^{8}}{N} \tag{2.131}$$

with $f$ in Hz.


For  example,  a  frequency  of  1  kHz  takes  613500  cycles  to  decay  to  one-tenth  of  its 
value.  The  duration  of  this  decay  is  about  613.5  s  and  corresponds  to  about  210  km 
for  a  traveling  wave.  Viscous  effects  are  more  pronounced  at  higher  frequencies,  as  we 
can  see  from  the  following  table  (we  are  considering  air  under  conditions  of  standard 
temperature  and  pressure). 

Very high frequencies (above 1 MHz) decay very quickly in contrast to frequencies in
the low range less than 20 kHz which decay extremely slowly.

    f          lambda       N (cycles)   time          distance
    1 kHz      34 cm        613500       613.5 s       210 km
    100 kHz    0.34 cm      6135         0.061 s       21 m
    1 MHz      0.034 cm     613          0.0006 s      21 cm
    10 MHz     0.0034 cm    61           0.000006 s    0.21 cm

Table 2.1: Viscous decay of acoustic waves in free space.
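
The entries of Table 2.1 can be regenerated from the decay formula. The sketch below assumes N = c_s^2 ln(10) / (2 pi^2 nu f), which follows from requiring alpha N 2 pi / omega = ln 10 with alpha = nu k^2 / 2 and k = omega / c_s; the duration is taken as N / f and the distance as c_s times the duration. The function names are hypothetical.

```python
import math

C_S = 34400.0      # speed of sound in air, cm/s
NU_TILDE = 0.225   # combined viscosity (shear plus bulk), cm^2/s

def cycles_to_one_tenth(freq_hz, c_s=C_S, nu=NU_TILDE):
    """Number of cycles for the amplitude to decay to one-tenth."""
    return c_s**2 * math.log(10.0) / (2.0 * math.pi**2 * nu * freq_hz)

def decay_row(freq_hz):
    """Return (wavelength cm, cycles, duration s, distance cm) for one frequency."""
    n = cycles_to_one_tenth(freq_hz)
    duration = n / freq_hz
    return C_S / freq_hz, n, duration, C_S * duration

for f in (1e3, 1e5, 1e6, 1e7):
    wavelength, n, duration, distance = decay_row(f)
    print(f"{f:10.0f} Hz  {wavelength:8.4f} cm  {n:10.0f}  "
          f"{duration:10.3g} s  {distance:10.3g} cm")
```

Because N scales as 1/f and the period also scales as 1/f, the decay time and the decay distance both fall off as the inverse square of the frequency.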


It should be noted that an alternative way of obtaining the above result is to look
for a solution which decays with distance instead of time. To find a solution that
decays with distance, we expect that the decaying exponential in time,

$$e^{-\alpha t} \tag{2.132}$$

should be replaced by a decaying exponential in space,

$$e^{-\frac{\alpha}{c_s}x} \tag{2.133}$$

based on the relation $kx = \omega t$ for the propagation of a traveling wave. In fact, if
we substitute a trial solution of the form (a decaying wave that travels to the right,
$x > 0$),

$$u = e^{i(kx - \omega t) - \beta x} \tag{2.134}$$

into the linear dissipative wave equation 2.117, we can show that $\beta$ is equal to $\alpha/c_s$
as expected,

$$\beta \approx \frac{k\,\tilde{\nu}\,\omega}{2 c_s^2} \approx \frac{k^2\,\tilde{\nu}}{2 c_s} = \frac{\alpha}{c_s} \tag{2.135}$$

Although the decay in space is similar to the decay in time, the two solutions differ
in some ways. Whereas the time solution corresponds to a free traveling wave or a
standing wave, the space solution corresponds to a wave emanating from an oscillating
boundary condition, and it expresses the viscous decay of sound with distance from
the source. The algebra of the two solutions is different also. For the solution in space
we have the equation,

$$-\omega^2 = c_s^2(\beta^2 - k^2) - 2i\,\beta k\,c_s^2 + i\,\tilde{\nu}\,\omega\,(k^2 - \beta^2) - 2\,\tilde{\nu}\,\omega\,\beta k \tag{2.136}$$

Equating the imaginary parts, we obtain a quadratic equation for $\beta$, and we choose
the decaying solution for positive $x > 0$ versus an unphysically-growing solution,

$$\beta = -\frac{k\,c_s^2}{\tilde{\nu}\,\omega}\left[1 - \sqrt{1 + (\tilde{\nu}\,\omega/c_s^2)^2}\right] \approx \frac{k\,\tilde{\nu}\,\omega}{2 c_s^2} \approx \frac{\alpha}{c_s} \tag{2.137}$$

Equating the real parts, and using the above approximate $\beta$, we find $\omega$ in terms of $k$,

$$\frac{\omega}{k} = c_s\sqrt{1 + 3\left(\frac{\tilde{\nu}\,\omega}{2 c_s^2}\right)^2} \approx c_s \tag{2.138}$$

Thus,  we  have  a  solution  valid  for  x  >  0, 

^  ^  ^t{kx  -  Lot)  -  fix 


,  k  i  (I  i(kx  —  cot)  —  0  X 

P  =  - Poe  ^  ^ 

to 


(2.139) 

(2.140) 

The  above  solution  decays  very  slowly  with  distance  x  for  frequencies  less  than 
100  kHz,  as  discussed  previously. 
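
As a rough numerical check of how slow this spatial decay is, the following Python
sketch evaluates the decay coefficient from the quadratic solution,
$\beta = -(k c_s^2/\nu\omega)[1 - \sqrt{1 + (\nu\omega/c_s^2)^2}]$, and compares it with the
approximation $k\omega\nu/(2c_s^2)$. The function name, the default values $c_s = 34400$ cm/s
and $\nu = 0.15$ cm$^2$/s (the simulation constants of section 2.6), and the algebraic
rearrangement that avoids round-off cancellation are assumptions made for
illustration; they are not part of the derivation above.

```python
import math


def spatial_decay(frequency_hz, c_s=34400.0, nu=0.15):
    """Return (beta_exact, beta_approx) in 1/cm for a traveling wave.

    Assumes omega = c_s * k, and CGS units (cm, s).
    """
    omega = 2.0 * math.pi * frequency_hz
    k = omega / c_s
    eps = nu * omega / c_s**2  # small dimensionless parameter
    # With eps ~ 1e-12, 1 - sqrt(1 + eps^2) evaluates to exactly 0 in floating
    # point, so the equivalent form
    #   -(1/eps) * (1 - sqrt(1 + eps^2)) = eps / (1 + sqrt(1 + eps^2))
    # is used instead.
    beta_exact = k * eps / (1.0 + math.sqrt(1.0 + eps * eps))
    beta_approx = k * omega * nu / (2.0 * c_s**2)
    return beta_exact, beta_approx


for f in (1.0e3, 1.0e5):
    exact, approx = spatial_decay(f)
    print(f"{f:8.0f} Hz: beta = {exact:.3e} 1/cm, "
          f"e-folding distance = {1.0 / exact:.3e} cm")
```

Under these assumed constants the e-folding distance is many orders of magnitude
larger than any musical instrument, which is consistent with the statement that
the decay is very slow.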

In the case of musical instruments (acoustic frequencies less than 20 kHz), the
above decaying solutions play a very small role. In particular, there are other effects
such as the expansion of a wave in space ($1/r^2$ in three-dimensional space) which can
reduce the power of a traveling wave much sooner than the viscous decay considered
above. In the case of waves enclosed within pipes, the dominant mechanisms of loss
of acoustic energy are the exchange of heat with the walls, and also the transverse
viscous forces as opposed to the longitudinal viscous forces that we have considered
above. We will consider a simple example of transverse viscous forces below.
The effect of heat transfer with the walls is ignored by the adiabatic model of sound
which assumes no heat transfer. Thermal effects are discussed in Kittel&Kroemer [28,
p.434]. Transverse friction in combination with thermal effects is discussed in
Landau&Lifshitz [32, p.301], and also Morse&Ingard [33, p.286].


2.5.3  Shear  waves 

This section analyzes shear waves which are transverse waves as opposed to the
longitudinal waves of the previous section. These shear waves are another solution of
the dissipative wave equations 2.112-2.114. They are not proper acoustic waves
however, because they do not involve any oscillations in density. In addition, we will see
that they are solutions of the incompressible Navier Stokes equations as well, without
any assumptions of linear acoustics such as small velocity amplitude. Physically, the
shear waves correspond to the flow that arises when a rigid plate performs tangential
oscillations along its own plane, and the fluid above the plate follows the oscillations
because of shear viscous forces. To obtain the shear waves mathematically, we look
for solutions of the form

$$u_x(y,t) \quad \text{and} \quad u_y = 0 \qquad (2.141)$$


If we substitute the above expressions in the dissipative wave equations 2.112-2.114,
we obtain

$$\frac{\partial \rho'}{\partial t} = 0 \qquad (2.142)$$

$$\frac{\partial u_x}{\partial t} = \nu \frac{\partial^2 u_x}{\partial y^2} - \frac{c_s^2}{\rho_0} \frac{\partial \rho'}{\partial x} \qquad (2.143)$$

$$\frac{\partial \rho'}{\partial y} = 0 \qquad (2.144)$$

Immediately, we conclude that $\rho'$ does not vary with $y$ and $t$. Further, since we
are looking for a velocity $u_x(y,t)$ that does not vary with $x$, equation 2.143 implies
that $\partial \rho'/\partial x$ is a constant. To be careful, we should actually consider the possibility
that there are velocity variations in $x$ which are extremely small but non-vanishing
(see section 2.4.3 on the paradox of incompressible flow). Then, according to
equation 2.143 the variations of the density gradient $\partial \rho'/\partial x$ must be even smaller than the
velocity variations by a factor of $1/c_s^2$. Thus, we can safely conclude that the density
gradient $\partial \rho'/\partial x$ is constant based on equation 2.143.

An alternative way of deriving equation 2.143 is to consider the incompressible
Navier Stokes equations discussed in section 2.4.3 instead of the acoustic
equations 2.112-2.114. If we assume a velocity of the form $u_x(y,t)$ and $u_y = 0$, the
divergence of the velocity becomes zero, so that incompressibility is satisfied. Then, a
substitution in the momentum Navier Stokes equations 2.58-2.60 produces the same
equation 2.143 that we obtained above in the acoustic approximation, except that now
we do not require the assumption of linear acoustics that the velocity amplitude is small.

To proceed further and to solve equation 2.143, we observe that the equation is
linear so that different solutions can be superimposed. One solution follows by letting
the time derivative of the velocity be zero; that is, by assuming steady flow. Then,
we obtain the Hagen-Poiseuille flow that we discussed earlier in section 2.4.3, which has
the form

$$u_x(y) = \frac{1}{2} \left( \frac{c_s^2}{\nu \rho_0} \frac{\partial \rho'}{\partial x} \right) y^2 + A + B y \qquad (2.145)$$

for  arbitrary  constants  A  and  B  that  can  be  used  to  satisfy  boundary  conditions.  This 
solution  can  be  superimposed  with  the  shear  wave  solution  that  we  obtain  immediately 
below. 

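The shear wave solution can be obtained by setting the density gradient, which
is constant as we argued above, equal to zero, and by substituting a trial solution of
the form

$$u_x(y,t) = e^{i(ky - \omega t) - \beta y} \qquad (2.146)$$

We find,

$$-i \omega = \nu (\beta^2 - k^2 - i 2 \beta k) \qquad (2.147)$$

which can be solved exactly to give,

$$\beta = \frac{\omega}{2 k \nu} \quad \text{and} \quad \beta = k \qquad (2.148)$$

$$u_x(y,t) = e^{i(ky - \omega t) - ky}, \qquad k = \sqrt{\frac{\omega}{2 \nu}} \qquad (2.149)$$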


If we impose the boundary condition that there is a solid wall at $y = 0$ which is
oscillating at a frequency of $\omega$ radians per second uniformly along its own plane (the
x-axis), then the above expression becomes a simple shear wave.

The above shear wave decays very fast with increasing distance from the oscillating
boundary. In particular, the penetration depth at which the amplitude decreases by
a factor of 10 is given by,

$$\delta = (\ln 10) \sqrt{\frac{2 \nu}{\omega}} \qquad (2.151)$$

For a frequency of 1 kHz in air at room temperature and atmospheric pressure the
penetration depth is about $1.6 \times 10^{-2}$ cm which is a very small distance.
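
The quoted penetration depth can be reproduced with a short Python sketch of
$\delta = (\ln 10)\sqrt{2\nu/\omega}$. The function name and the default kinematic viscosity
$\nu = 0.15$ cm$^2$/s (the value given for air in section 2.6) are assumptions made for
illustration.

```python
import math


def penetration_depth(frequency_hz, nu=0.15, factor=10.0):
    """Distance (cm) over which a shear wave decays by `factor`.

    The amplitude falls as exp(-k*y) with k = sqrt(omega / (2*nu)),
    so the depth is ln(factor) / k.
    """
    omega = 2.0 * math.pi * frequency_hz
    return math.log(factor) * math.sqrt(2.0 * nu / omega)


print(f"1 kHz: {penetration_depth(1000.0):.4f} cm")
print(f"100 Hz: {penetration_depth(100.0):.4f} cm")
```

Because the depth scales as $1/\sqrt{\omega}$, lowering the frequency by a factor of 10
thickens the layer only by a factor of about 3.2.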

The above shear waves provide an estimate of the viscous boundary layer of acoustic
waves that are traveling along the length of a pipe. A plane wave inside a horizontal
pipe oscillates back and forth along the x direction and creates friction against
the walls in an analogous way to the oscillating shear waves above. Of course, there
is one difference: the acoustic waves oscillate sinusoidally in x, as opposed to the
uniform motion in x that we considered above for an oscillating wall. Nevertheless, the
penetration depth that we calculated above provides an approximate estimate of the
viscous boundary layer of acoustic waves. It also shows that the effects of shear
friction are much more pronounced than the effects of longitudinal friction that we
considered in the previous section because the shear wave decays to zero within a
very short distance in contrast to the longitudinal decay.

Another application of the shear wave solution that we obtained above is the
testing of numerical methods, as we will see in section 4.5. In particular, for the
purpose of numerical testing it is convenient to impose two boundary conditions: a
non-moving wall at $y = 0$, and an oscillating plate at $y = 1$. We can satisfy these
boundary conditions (Landau&Lifshitz [32, p.45]) if we constrain the general shear
wave solution given by equation 2.149 to be of the form.


By expanding the sines of imaginary quantities in terms of hyperbolic sines and
cosines, and performing some algebra, we can obtain a real solution for the problem
of an oscillating plate above a non-moving plate. The solution is presented and is
used for numerical testing purposes in section 4.5.

In the next section, the relative size of the acoustic terms is examined when an
acoustic wave is substituted into the Navier Stokes equations. This will confirm some
of the discussions in the previous sections, for example the small effects of viscosity on
acoustic waves, and it will also point to the limitations of the linear acoustic theory.


2.5.4  Relative  size  of  acoustic  terms 


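In order to estimate the relative size of the acoustic terms in a complete wave equation
that includes both nonlinear advective terms and viscous terms, we consider a
one-dimensional Navier Stokes equation for the velocity (Morse&Ingard [33, p.862]),

$$\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + \frac{c_s^2}{\rho_0} \frac{\partial \rho'}{\partial x} - \nu \frac{\partial^2 u}{\partial x^2} = 0 \qquad (2.153)$$

We substitute a typical sinusoidal wave in the above equation where $u_0$ is the wave
amplitude,

$$u(x,t) = u_0 \, e^{i(kx - \omega t)}, \qquad \rho'(x,t) = [\rho_0 / c_s] \, u_0 \, e^{i(kx - \omega t)} \qquad (2.154)$$

We can estimate the relative size of the terms as follows, where we normalize against
the size of the first term,

$$\begin{array}{cccc}
(\partial u / \partial t) & (u \, \partial u / \partial x) & ((c_s^2 / \rho_0) \, \partial \rho' / \partial x) & (\nu \, \partial^2 u / \partial x^2) \\
u_0 \omega & u_0^2 k & u_0 k c_s & u_0 k^2 \nu \\
1 & u_0 / c_s & 1 & k \nu / c_s
\end{array} \qquad (2.155)$$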


If we compare the nonlinear advective term and the viscous term above, we obtain
the result,

$$\frac{u \, \partial u / \partial x}{\nu \, \partial^2 u / \partial x^2} \sim \frac{u_0^2 k}{u_0 k^2 \nu} = \frac{u_0}{k \nu} \qquad (2.156)$$

Therefore, the viscous decay of longitudinal waves that we calculated in section 2.5.2
makes sense when the amplitude $u_0$ of the wave is significantly less than $k \nu$. For
example, we have the following numbers for $\nu = 0.225$ cm$^2$/s,

    f          lambda     u_0 = k nu    pressure level

    1 kHz      34 cm      0.04 cm/s     78 dB               (2.157)
    100 kHz    0.34 cm    4.2 cm/s      118 dB

We calculate the pressure level in decibels using the relation (see section 2.6),

$$\text{pressure level} = 74 + 20 \log_{10}(c_s^2 \rho') \qquad (2.158)$$

We also use the relation $u = c_s \rho' / \rho_0$ for a free traveling wave. The above numbers
indicate that if the frequency is 1 kHz, we can neglect the nonlinear advective terms
in comparison to the viscous effects, if the wave amplitude is significantly less than
0.04 cm/s. A factor of 10 would require that the pressure level is less than 58 dB,
which is a very weak sound. Of course, the viscous effects increase with higher
frequency, so that the viscous solution of section 2.5.2 applies to a wider range of
sounds when the frequency is high.
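
The term-by-term comparison can be checked numerically with the following Python
sketch. It evaluates the four term sizes $u_0\omega$, $u_0^2 k$, $u_0 k c_s$ and $u_0 k^2 \nu$
for a given frequency and amplitude, normalizes them by the first term, and
reports the crossover amplitude $u_0 = k\nu$ at which the advective and viscous
terms are equal. The function name, the dictionary keys, and the default values
$c_s = 34400$ cm/s and $\nu = 0.225$ cm$^2$/s are assumptions chosen to match the numbers
quoted in this section.

```python
import math


def normalized_terms(frequency_hz, u0, c_s=34400.0, nu=0.225):
    """Sizes of the terms of the 1D wave equation, relative to du/dt."""
    omega = 2.0 * math.pi * frequency_hz
    k = omega / c_s                     # free traveling wave: omega = c_s * k
    time_derivative = u0 * omega        # size of du/dt
    return {
        "time_derivative": 1.0,
        "advective": (u0 * u0 * k) / time_derivative,    # equals u0 / c_s
        "pressure": (u0 * k * c_s) / time_derivative,    # equals 1
        "viscous": (u0 * k * k * nu) / time_derivative,  # equals k * nu / c_s
        "crossover_u0": k * nu,         # amplitude where advective = viscous
    }


for f in (1.0e3, 1.0e5):
    terms = normalized_terms(f, u0=0.04)
    print(f"{f:8.0f} Hz: crossover u0 = {terms['crossover_u0']:.3f} cm/s, "
          f"advective = {terms['advective']:.2e}, viscous = {terms['viscous']:.2e}")
```

With these assumed constants the crossover amplitudes come out close to the
0.04 cm/s and 4.2 cm/s entries quoted above; small differences arise from rounding
the wavelength to 34 cm.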

Returning to equation 2.155 we can see that both the viscous terms and the
nonlinear advective terms are smaller by a factor of $u_0 / c_s$ compared to the remaining
terms, such as the time derivative of velocity. This is the reason why we can neglect
the nonlinear and the viscous terms in many situations and work with the linear
inviscid wave equation. Of course, there are limitations to the linear inviscid theory.
In particular, the above comparison of equation 2.155 assumes a free traveling wave,
and does not apply inside the viscous shear boundary layer where the velocity must
decrease to zero in a very short distance, as we saw in section 2.5.3. In the shear
boundary layer the viscous terms, such as $\nu \, \partial^2 u / \partial y^2$, are very large.

In free space the effect of nonlinearities on acoustic waves is small if the wave
amplitude is much smaller than the speed of sound. We can estimate some numbers
as follows, where we use the relation $u_0 / c_s = \rho' / \rho_0$ for a free traveling wave,

    u_0          u_0/c_s    pressure level

    34 cm/s      0.001      137 dB
    344 cm/s     0.01       157 dB              (2.159)
    3440 cm/s    0.1        177 dB


Therefore,  nonlinear  effects  in  free  space  become  important  for  very  loud  sounds  only. 

An example of a typical nonlinear effect is frequency doubling. In particular, if
we have a sinusoidal wave of frequency $\omega$,

$$u = \sin(2 \pi x / \lambda - \omega t) \qquad (2.160)$$

the nonlinear advective term produces an oscillation of twice the original frequency,

$$u \, (\partial u / \partial x) = \frac{\pi}{\lambda} \sin(4 \pi x / \lambda - 2 \omega t) \qquad (2.161)$$

The  frequency  doubling  effect  is  one  of  the  reasons  why  the  linear  analysis  based 
on  complex  exponentials  does  not  work  in  the  nonlinear  regime.  Of  course,  there  are 
many  other  nonlinear  effects  that  we  do  not  understand,  and  we  can  not  even  identify 
them. 
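
The frequency doubling can be verified numerically. The Python sketch below
assumes a real sinusoid $u = u_0 \sin(2\pi x/\lambda - \omega t)$, evaluates the advective
term $u \, \partial u/\partial x$ directly, and compares it with the double-angle form
$(\pi u_0^2/\lambda) \sin(4\pi x/\lambda - 2\omega t)$. The function names, the amplitude
parameter `u0`, and the sample values of wavelength and frequency are assumptions
made for illustration.

```python
import math

WAVELENGTH = 34.0                   # cm, roughly a 1 kHz wave in air
OMEGA = 2.0 * math.pi * 1000.0      # radians per second


def advective_term(x, t, u0=1.0):
    """u * du/dx evaluated directly for u = u0 * sin(2*pi*x/lambda - omega*t)."""
    theta = 2.0 * math.pi * x / WAVELENGTH - OMEGA * t
    u = u0 * math.sin(theta)
    du_dx = u0 * (2.0 * math.pi / WAVELENGTH) * math.cos(theta)
    return u * du_dx


def doubled_frequency_form(x, t, u0=1.0):
    """The same quantity written as a single sinusoid at frequency 2*omega."""
    theta = 2.0 * math.pi * x / WAVELENGTH - OMEGA * t
    return (math.pi * u0 * u0 / WAVELENGTH) * math.sin(2.0 * theta)


# Largest difference between the two forms over a small grid of sample points.
worst = max(
    abs(advective_term(x, t) - doubled_frequency_form(x, t))
    for x in (0.0, 3.0, 11.5, 34.0)
    for t in (0.0, 1.0e-4, 2.5e-4, 7.0e-4)
)
print(f"largest difference: {worst:.2e}")
```

The two forms agree to round-off, which is just the identity
$\sin\theta \cos\theta = \frac{1}{2}\sin 2\theta$.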

Although nonlinear effects are weak for acoustic waves in free space, this is not
the case for acoustic waves in confined space. In particular, a wave in free space
is typically generated by a small source where the acoustic energy is concentrated
initially before expanding as $1/r^2$ in three-dimensional space. Therefore, near the
source the wave amplitude is large, and nonlinear effects can be very important, as
in the case of the air jet in a flue pipe. Nonlinear effects are the basic mechanism for
amplification of sound in flue pipes (Verge94 [56], Hirschberg94 [26]).

Having discussed the limitations of linear acoustic theory in this section, the
question arises whether it is reasonable to try to distinguish acoustic waves from other
variations of density in nonlinear regimes. This question is important in the computer
simulations of flue pipes, and is discussed below.


2.5.5 Distinguishing acoustic from hydrodynamic

In the subsonic regime, a simple rule for distinguishing acoustic waves from
non-acoustic flow is the propagation speed. Acoustic waves propagate at the speed of
sound which is much faster than the speed of non-acoustic or hydrodynamic flow.
Hydrodynamic flow consists of vortices, boundary layers, etc. that are slow-moving
compared to sound waves. This difference in speed appears distinctly in the
frequency domain. The frequencies of acoustic waves are typically much higher than
the frequencies of non-acoustic flow, and we can exploit this property to distinguish
the acoustic waves from slower hydrodynamic variations of density. In the computer
simulations and in the physical experiments, a time series of the density is obtained
by sampling at a fixed location in space. Then, the sound waves are identified as the
relatively high frequencies in the spectrum, and hydrodynamic flow as the relatively
low frequencies in the spectrum.
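
The frequency criterion can be illustrated with a minimal Python sketch. It is not
the analysis procedure used in this work: a centered moving average stands in for
a low-pass filter, the slow output is labeled "hydrodynamic" and the remainder
"acoustic". The function name, the sampling rate, and the synthetic signal (a
20 Hz drift of unit amplitude plus a 1 kHz tone of amplitude 0.1) are all
hypothetical choices.

```python
import math


def split_acoustic_hydrodynamic(samples, window):
    """Split a time series into a slow part and a fast remainder.

    The slow part is a centered moving average over `window` samples
    (`window` must be odd); the first and last window // 2 samples are dropped.
    """
    half = window // 2
    slow, fast = [], []
    for n in range(half, len(samples) - half):
        mean = sum(samples[n - half:n + half + 1]) / window
        slow.append(mean)
        fast.append(samples[n] - mean)
    return slow, fast


# Synthetic density signal sampled at a fixed location.
fs = 41000                              # samples per second (hypothetical)
signal = [
    1.0 * math.sin(2.0 * math.pi * 20.0 * n / fs)       # slow drift
    + 0.1 * math.sin(2.0 * math.pi * 1000.0 * n / fs)   # acoustic tone
    for n in range(4100)
]

# 41 samples span exactly one period of the 1 kHz tone, so the average removes it.
slow, fast = split_acoustic_hydrodynamic(signal, window=41)
print(f"peak of slow part: {max(slow):.3f}, peak of fast part: {max(fast):.3f}")
```

The separation works here because the two frequencies are far apart; as the text
goes on to discuss, it becomes unreliable where the two motions have comparable
amplitude and interact strongly.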

The above distinction between acoustic and non-acoustic motion may become
blurry in regions such as near the jet of a recorder flue pipe, where acoustic waves
and hydrodynamic flow interact with each other very strongly. The difficulty is that
the amplitude of the acoustic motion and the amplitude of the hydrodynamic motion
are comparable with each other near the jet. The oscillations of the jet generate
acoustic waves and are also driven by acoustic waves, so that the two motions blur
into each other and become one. This is not surprising because the acoustic and
hydrodynamic regimes are simply different limits (approximations) of one flow behavior
that is described by the Navier Stokes equations.

Fortunately in the case of flue pipes, the strong interactions between acoustic
waves and hydrodynamic flow are limited to the region of the jet orifice and the
labium (the edge on which the jet impinges). Thus, a little further away from this
sensitive region, the acoustic waves quickly uncouple from the slower hydrodynamic
flow, and we can use the simple criterion of frequency range described above to
distinguish the acoustic waves from the hydrodynamic variations of density.

    temperature           density           kinematic viscosity    speed of sound
    degrees centigrade    10^-3 gm/cm^3     cm^2/s                 cm/s

    15                    1.226             0.145                  34060
    20                    1.205             0.150                  34290
    25                    1.184             0.155                  34581

Table 2.2: Air-constants at various temperatures.

2.6  Appendix:  units  and  constants 

This appendix summarizes the units and constants which are employed in the
computer simulations. The speed of sound is chosen equal to 34400 cm/s, and the
kinematic viscosity is set equal to 0.15 cm$^2$/s. These values correspond to a mean constant
temperature of 22 degrees centigrade. Regarding the density, the units of mass are
normalized by 0.0012 gm/cm$^3$ so that the mean density is unity. Table 2.2 lists typical
values of the mean density and the kinematic viscosity of air at room temperatures
and atmospheric pressure. These values are taken from Newman [34, p.388]. The
speed of sound of air shown in table 2.2 is calculated using the following formula from
Olson [36, p.10],

$$c_s = 33100 \sqrt{1 + 0.00366 \, T} \qquad (2.162)$$

where  T  is  the  temperature  in  degrees  centigrade.  The  above  formula  for  the  speed 
of  sound  is  equivalent  to  equation  2.29  which  was  derived  in  section  2.3.  Note  that 
the  factor  0.00366  of  equation  2.162  is  equal  to  1/273,  and  273  +  T  gives  the  absolute 
temperature  in  degrees  Kelvin. 
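
The temperature formula is simple enough to evaluate directly. The Python sketch
below implements $c_s = 33100\sqrt{1 + 0.00366\,T}$ in cm/s and also the equivalent
form written with the absolute temperature, $33100\sqrt{(273 + T)/273}$. The function
names are assumptions made for illustration.

```python
import math


def speed_of_sound(temp_celsius):
    """Speed of sound in air (cm/s) from the formula quoted from Olson."""
    return 33100.0 * math.sqrt(1.0 + 0.00366 * temp_celsius)


def speed_of_sound_absolute(temp_celsius):
    """Same relation written with the absolute temperature 273 + T."""
    return 33100.0 * math.sqrt((273.0 + temp_celsius) / 273.0)


for temp in (20.0, 22.0, 25.0):
    print(f"{temp:4.0f} C: {speed_of_sound(temp):8.0f} cm/s "
          f"(absolute-temperature form: {speed_of_sound_absolute(temp):8.0f} cm/s)")
```

The two forms differ only because 0.00366 is a rounded value of 1/273, so they
agree to within a few cm/s at room temperature.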

In order to compare intensities of sound, the scale of decibels of sound pressure
level is used (Sekuler&Blake [45, p.298]). The scale of decibels is defined as the
logarithm of the ratio of pressure fluctuation $P'$ divided by a normalizing pressure
fluctuation $P'_0$ which is referred to as the standard pressure level,

$$20 \log_{10} \frac{P'}{P'_0} \qquad (2.163)$$

The standard pressure level is the weakest sound that an average human can hear,
and it is approximately,

$$P'_0 = 2 \times 10^{-4} \ \text{gm/(cm s}^2\text{)} \qquad (2.164)$$

Using the relation $P' = c_s^2 \rho'$ the following formula is obtained,

$$20 \log_{10} \frac{P'}{P'_0} = 74 + 20 \log_{10}(c_s^2 \rho') \qquad (2.165)$$

The above formula is useful in the computer simulations where the normalized density
fluctuations $\rho' / \rho_0$ appear. For the mean density of air, the value $\rho_0 = 0.0012$ gm/cm$^3$
is used.
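
As a sketch of how the decibel formula is applied to simulation output, the
following Python function converts a normalized density fluctuation $\rho'/\rho_0$
into a sound pressure level, using $P' = c_s^2 \rho'$ and the standard pressure level
of $2 \times 10^{-4}$ gm/(cm s$^2$). The function and constant names are assumptions made
for illustration; the constants are the ones stated in this appendix.

```python
import math

C_S = 34400.0      # speed of sound, cm/s
RHO_0 = 0.0012     # mean density of air, gm/cm^3
P_REF = 2.0e-4     # standard pressure level, gm/(cm s^2)


def pressure_level_db(relative_density_fluctuation):
    """Sound pressure level (dB) for a given rho'/rho_0."""
    pressure_fluctuation = C_S**2 * RHO_0 * relative_density_fluctuation
    return 20.0 * math.log10(pressure_fluctuation / P_REF)


for ratio in (0.001, 0.01, 0.1):
    print(f"rho'/rho_0 = {ratio:5.3f}: {pressure_level_db(ratio):5.0f} dB")
```

For a free traveling wave $\rho'/\rho_0 = u_0/c_s$, so these three ratios correspond to
the 137 dB, 157 dB and 177 dB entries listed in section 2.5.4.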

It should be noted that the results of two-dimensional simulations can not be
related exactly with the three-dimensional world; in particular, the two-dimensional
density has units gm/cm$^2$ as opposed to gm/cm$^3$ for the three-dimensional density.
One way of avoiding this problem of units is to work with dimensionless ratios such as
$\rho' / \rho_0$. Of course, the problem of relating 2D to 3D results involves more than matching
the units. For example, there are many 3D effects that remain un-modeled in 2D,
such as 3D-expansion of waves versus a 2D-expansion, and also vortex stretching in
3D space (Tritton [54, p.114]) to mention a few (see also section 1.4 and chapter 7).

For completeness, a few definitions of dimensionless numbers are summarized
here. Dimensionless numbers can be obtained by combining characteristic lengths of
the flow with physical constants such as $c_s$ and $\nu$ that appear in the Navier Stokes
equations. For example, the Mach number is defined as the ratio of the flow speed
divided by the speed of sound,

$$M = \frac{U}{c_s} \qquad (2.166)$$

The flow speed $U$ in the above equation is typically the maximum speed or the
mean speed of the flow. In the case of subsonic jet phenomena inside flue pipes, the
maximum flow speed is smaller than the speed of sound by a factor of 10 to 1000,
so the Mach number is between $10^{-3}$ and $10^{-1}$.

Another important dimensionless number is the Reynolds number which measures
the size of fluid inertia relative to the size of viscous effects (Tritton [54, p.97]). It is
given by the ratio,

$$Re = \frac{U l}{\nu} \qquad (2.167)$$

where $U$ and $l$ are characteristic velocity and length scales of the flow. The choice
of characteristic scales is somewhat arbitrary, and it depends on the geometry of the
flow, and on which features we choose to focus on. For example, in the case of flow
through a pipe (Hagen-Poiseuille flow, Landau&Lifshitz [32, p.51]), the length $l$ is
typically chosen to be the diameter of the pipe, and the speed $U$ is chosen to be the
mean speed of the flow. In the case of jets that emerge from a narrow orifice, such
as the ones of section 1.4 and chapter 7, the convention is also to choose $l$ as the
diameter of the orifice, and $U$ the mean speed of the flow.

A third dimensionless number that is relevant in simulations of subsonic jets is the
Strouhal number, which measures the relative frequency of oscillation. For example, if
a jet executes transverse oscillations relative to its forward motion, then the Strouhal
number can be defined as the ratio of the frequency $f$ of oscillations multiplied by
the diameter $l$ of the jet, and divided by the jet speed $U$,

$$St = \frac{f \, l}{U} \qquad (2.168)$$

Other dimensionless numbers in addition to the above can be found in standard
textbooks (Batchelor [3] and Newman [34]) and in specialized areas of fluid mechanics.
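
The three definitions above translate directly into code. In the Python sketch
below, the function names are assumptions, the default constants are the
simulation values of this appendix ($c_s = 34400$ cm/s, $\nu = 0.15$ cm$^2$/s), and the
example jet (speed 1000 cm/s, orifice 0.1 cm, oscillation at 1000 Hz) is a
hypothetical case chosen only to show the orders of magnitude.

```python
def mach_number(flow_speed, sound_speed=34400.0):
    """M = U / c_s."""
    return flow_speed / sound_speed


def reynolds_number(flow_speed, length, kinematic_viscosity=0.15):
    """Re = U * l / nu."""
    return flow_speed * length / kinematic_viscosity


def strouhal_number(frequency, length, flow_speed):
    """St = f * l / U."""
    return frequency * length / flow_speed


# Hypothetical jet in CGS units.
jet_speed = 1000.0    # cm/s
orifice = 0.1         # cm
frequency = 1000.0    # Hz

print(f"M  = {mach_number(jet_speed):.4f}")
print(f"Re = {reynolds_number(jet_speed, orifice):.1f}")
print(f"St = {strouhal_number(frequency, orifice, jet_speed):.2f}")
```

For this hypothetical jet the Mach number is about 0.03, which falls inside the
range $10^{-3}$ to $10^{-1}$ mentioned above.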

The  next  chapter  begins  the  discussion  of  numerical  methods. 


Chapter  3 

Numerical  methods  for  fluid  flow 


Except for special cases, the Navier Stokes equations of the previous chapter can
not be solved analytically. Therefore, numerical methods must be used. Below, the
basic ideas of finite difference methods are reviewed. Subsequently, an explicit finite
difference method for solving the compressible Navier Stokes equations is described.
Also, an explicit finite difference method for solving the incompressible Navier Stokes
equations is described which is used for numerical testing purposes only. Most of
the ideas presented here can be found in textbooks of computational fluid dynamics.
Some results which are not easily available in the literature (as far as I know) are
the discussion on why explicit numerical methods are appropriate for subsonic flow
in section 3.2.1, and the analysis of the CFL (Courant-Friedrichs-Lewy) condition in
section 3.3.2.


3.1  Numerical  grids 

The Navier Stokes equations can be solved numerically by introducing a numerical
grid in space and time. For the sake of simplicity, only the spatial dimensions of
the grid are described here. To include a time dimension, we can imagine making
copies of the planar grids shown in figure 3-1, and stacking them on top of each other.

Figure 3-1: Three simple types of numerical grids: uniform orthogonal, curvilinear,
non-uniform orthogonal.

A numerical grid defines a discretization of spacetime, and replaces the continuous
functions of density and velocity $\rho$, $V_x$, $V_y$ by a discrete set of values defined at the
nodes of the grid (namely, the points where the grid lines of figure 3-1 intersect).

Numerical grids are distinguished into staggered or non-staggered depending on
whether the fluid variables are defined exactly at the grid nodes, or halfway between
the grid nodes. For example, the fluid velocity can be defined halfway between the
grid nodes, and the fluid density can be defined exactly at the grid nodes. This
staggered allocation of variables has advantages in some cases (Peyret&Taylor [38]),
but it is more complex to deal with than a straightforward non-staggered grid where
all the variables are defined at the grid nodes. In the present work, non-staggered
grids are employed exclusively.

Numerical grids can be distinguished into uniform or non-uniform. For example,
the grid shown at the top of figure 3-1 is a uniform orthogonal grid, and the grid
used in section 4.1.1 is a uniform hexagonal grid. In the present work, only uniform
grids are used because they are very simple to program and highly-suited for parallel
computing. Another reason for using uniform grids is that the lattice Boltzmann
method works only with uniform grids as far as is known today. The only way to
extend lattice Boltzmann to non-uniform grids is to employ two grids of different
uniform resolution joined together via interpolation (a technique called composite
grids). This idea for lattice Boltzmann is outlined in section 4.6.2.

Non-uniform grids increase the resolution (density of grid points) in certain
regions, while decreasing the resolution in other regions where the flow is smooth and
not much is happening. Sometimes, the change of resolution introduces numerical
artifacts. To minimize the artifacts, the resolution of the grid should be varied smoothly,
if possible. On the other hand, smoothness does not guarantee the absence of
artifacts. In particular, acoustic waves are very sensitive to changes of resolution, and
should be carefully tested in regions where the resolution is changing.

Non-uniform grids include the composite grids mentioned above, and also
curvilinear and non-uniform orthogonal grids which are shown at the middle and bottom
of figure 3-1. In the case of curvilinear grids, a coordinate transformation from the
curvilinear space to a uniform orthogonal space (called logical space) is usually
employed. The Navier Stokes equations are transformed to new coordinates, and finite
differences are applied to the transformed equations on the logical grid [52].
Curvilinear grids are often designed to be body-conforming in order to approximate closely
the shape of smooth boundaries such as airfoils.

An alternative to coordinate transformations is to discretize the Navier Stokes
equations directly in the physical space by taking finite differences based on the local
spacings around each grid point (Peyret&Taylor [38, p.326]). Direct discretization in
the physical space is typically used in the case of non-uniform orthogonal grids, and
also in the case of unstructured non-uniform grids described below.

Both the curvilinear and the non-uniform orthogonal grids of figure 3-1 are
well-structured grids. By contrast, there are also unstructured non-uniform grids (not
shown here) where the grid points are "laid out" with almost complete freedom
in order to match the boundaries and the areas where higher resolution is needed
(Camp et al. [6]). Unstructured grids are very popular and very promising. A lot of
research is currently being done to find good ways of parallelizing unstructured grids.

The  above  catalogue  of  numerical  grids  should  put  in  perspective  the  uniform 
grids  which  are  used  here.  Uniform  grids  are  not  the  most  efficient  grids  that  are 
possible,  but  they  are  very  simple  to  use,  and  very  easy  to  parallelize.  In  the  next 
section,  the  choice  between  explicit  and  implicit  methods  is  discussed. 

3.2  Explicit  versus  implicit 

Numerical methods for fluid dynamics can be distinguished into explicit and implicit.
In the case of an explicit method, the future value $V(t + \Delta t, x, y)$ of a fluid variable
$V(t, x, y)$ at the grid point $(x, y)$ depends only on the present and past values at
neighboring points. In other words, an explicit method uses only local interactions to
calculate the future values of density and velocity. An example of an explicit method
is shown graphically at the left side of figure 3-2 with the time axis increasing in the
vertical direction. Here, the future value of the central node depends on the present
value of the central node and also on the present values of the four neighbors.
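
The local character of an explicit method can be sketched in a few lines of
Python. The example below is not the compressible scheme developed in this work;
it uses the two-dimensional diffusion equation as a stand-in, because its
explicit update has exactly the five-point dependence described above. The
function name and the parameter `coeff` (standing for $\nu \Delta t / \Delta x^2$) are
assumptions made for illustration.

```python
def explicit_step(field, coeff):
    """Advance a 2D field by one time step with a five-point explicit stencil.

    Each new interior value depends only on the present value at the same
    node and at its four neighbors. Boundary values are held fixed.
    coeff = nu * dt / dx**2 for the diffusion equation.
    """
    rows, cols = len(field), len(field[0])
    new = [row[:] for row in field]          # copy, so only old values are read
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbors = (field[i - 1][j] + field[i + 1][j]
                         + field[i][j - 1] + field[i][j + 1])
            new[i][j] = field[i][j] + coeff * (neighbors - 4.0 * field[i][j])
    return new


# A single disturbance at the center of a 5 x 5 grid.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 1.0
grid = explicit_step(grid, coeff=0.1)
for row in grid:
    print(" ".join(f"{value:5.2f}" for value in row))
```

After one step the disturbance has reached only the four nearest neighbors, which
shows that information travels at most one grid spacing per time step in such a
scheme.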

Figure 3-2: Explicit and implicit discretizations in two-dimensional space with the
time axis increasing vertically.

In the case of an implicit method, the future value $V(t + \Delta t, x, y)$ of a fluid variable
$V(t, x, y)$ at the grid point $(x, y)$ depends on the future values of neighboring nodes
such as $V(t + \Delta t, x + \Delta x, y + \Delta y)$ (right side of figure 3-2). This implies that the
neighboring node $V(t + \Delta t, x + \Delta x, y + \Delta y)$ depends on other nodes further down the
grid in a similar way, for example $V(t + \Delta t, x + 2\Delta x, y + 2\Delta y)$. Consequently, an
implicit method couples together distant nodes, and introduces a large matrix equation
that extends the length of the numerical grid. The matrix equation makes implicit
methods more stable than explicit methods, but also more difficult to parallelize.
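
To make the matrix coupling concrete, the following Python sketch performs one
implicit (backward Euler) step of the one-dimensional diffusion equation. This is
a stand-in chosen for brevity, not a method used in this work. The unknown future
values satisfy a tridiagonal system, which is solved here with the Thomas
algorithm; the function name and the parameter `coeff` (standing for
$\nu \Delta t / \Delta x^2$) are assumptions made for illustration.

```python
def implicit_step(values, coeff):
    """One backward-Euler step of 1D diffusion; the two end values are held fixed.

    Solves, for the interior unknowns v[i] at the future time,
        -coeff * v[i-1] + (1 + 2*coeff) * v[i] - coeff * v[i+1] = values[i]
    using tridiagonal (Thomas) elimination.
    """
    n = len(values)
    new = values[:]
    m = n - 2                           # number of interior unknowns
    if m <= 0:
        return new
    diag = 1.0 + 2.0 * coeff
    rhs = values[1:-1]
    rhs[0] += coeff * values[0]         # known boundary values move to the right side
    rhs[-1] += coeff * values[-1]
    c_prime = [0.0] * m
    d_prime = [0.0] * m
    c_prime[0] = -coeff / diag
    d_prime[0] = rhs[0] / diag
    for i in range(1, m):               # forward elimination
        denom = diag + coeff * c_prime[i - 1]
        c_prime[i] = -coeff / denom
        d_prime[i] = (rhs[i] + coeff * d_prime[i - 1]) / denom
    new[n - 2] = d_prime[m - 1]
    for i in range(m - 2, -1, -1):      # back substitution
        new[i + 1] = d_prime[i] - c_prime[i] * new[i + 2]
    return new


start = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
after = implicit_step(start, coeff=10.0)
print(" ".join(f"{value:.4f}" for value in after))
```

Even after a single step, every interior node is non-zero: the disturbance at the
center has influenced the whole grid. This is the global coupling that allows the
large value of `coeff` used here, and it is also what makes the solve harder to
distribute across processors than a purely local update.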

Explicit methods are very simple, ideally scalable, and highly suitable for large
parallel computers with small communication capabilities (see chapter 6). However,
explicit methods require small integration time steps in order to remain numerically
stable. By contrast, implicit methods are challenging to parallelize, and have large
communication requirements. However, implicit methods can use much larger
integration time steps than explicit methods. Because of these differences between explicit
and implicit methods, the decision of which method to use depends on the available
computer system and on the problem's requirements regarding the integration time
step. For instance, the simulation of subsonic flow requires small integration time
steps in order to follow the fast-moving acoustic waves (see below). Thus, an explicit
method is generally a good choice for subsonic flow.

A possibility that deserves to be explored in the future is intermediate methods
between explicit and implicit. By this, I do not mean semi-implicit methods where some
terms of the Navier-Stokes equations are discretized implicitly and others explicitly.
Also, I do not mean "alternating directions" (Peyret&Taylor [38]) where the matrix
equation is split into smaller matrices that are solved in succession: first along the
x-direction, then along the y-direction, then along the z-direction. Such approaches
reduce the size of the matrix that accompanies an implicit method, but still produce
a matrix that extends the length of the numerical grid, and presents formidable
difficulties for parallel computing. Instead, the real breakthrough would be to develop
numerical methods that have the stability properties of implicit methods without
using matrices that extend the whole grid. Such "intermediate" methods would retain
some of the locality of explicit methods that is very important for parallel computing.
An effort towards this direction in the context of the diffusion equation is discussed
in [2] and references therein.


3.2.1  Small  integration  time  steps  for  subsonic  flow 

The integration time step $\Delta t$ in simulations of subsonic flow must be small both
for explicit and implicit methods. An approximate constraint on the numerical speed
$\Delta x / \Delta t$ of explicit methods can be obtained from the CFL condition
(Courant-Friedrichs-Lewy) which says that the domain of numerical dependence must include
the domain of physical dependence. The CFL condition must be satisfied in order to be
able to simulate the physical phenomenon. In the case of simple hyperbolic problems (such
as the wave equation), it can be shown that the CFL condition is also a necessary
condition for the stability of explicit methods (Courant et al. [14]). The CFL condition
can be written approximately as follows,

$$ \frac{\Delta x}{\Delta t} \geq c_s \tag{3.1} $$

where $c_s$ is the propagation speed of acoustic waves. In other words, the CFL condition
requires that the numerical speed $\Delta x / \Delta t$ must be at least as large as the
physical speed. A more accurate formula for the CFL condition is derived in sections
3.3.1 and 3.3.2.

In the case of implicit methods, the CFL condition cannot be applied directly because
the matrix of an implicit method introduces dependencies (interactions) between distant
nodes along the entire length of the numerical grid. Therefore, the numerical speed of an
implicit method is, in some sense, the length of the grid divided by $\Delta t$, which is
a very large numerical speed. On the other hand, this speed cannot be compared with the
physical speed of acoustic waves in a meaningful way because the matrix-introduced
interactions are not physical interactions. Another difficulty in trying to interpret the
CFL condition in the context of implicit methods is that many implicit methods are known
to be unconditionally stable (Peyret & Taylor [38]) under linear stability analysis.
Therefore, such methods can compute a stable solution (though not necessarily accurate or
correct) even when the time step $\Delta t$ is much larger than the CFL limit. All this
shows that the CFL condition is inconclusive in the case of implicit methods.

An approximate constraint on the numerical speed $\Delta x / \Delta t$ of implicit
methods can be obtained by inquiring whether the computed solution simulates accurately
the physical phenomena under consideration. In the case of acoustic waves that propagate
through the fluid and reflect off obstacles, the time step $\Delta t$ must be small
enough to follow the propagation of acoustic waves. In particular, the product
$c_s \Delta t$ must be less than a few $\Delta x$ in order to have enough resolution to
simulate the passage and reflection of acoustic waves,

$$ \Delta x \sim c_s \Delta t \tag{3.2} $$

The above constraint arises from the time-scales of the problem, and applies both to
implicit and explicit methods.

As stated earlier, throughout this work only explicit methods are used. In the next
section, an explicit finite difference method is described for solving the compressible
Navier-Stokes equations.


3.3  Compressible  finite  difference  method 


Let us consider a uniform orthogonal grid with $\Delta x$, $\Delta y$, $\Delta z$,
$\Delta t$ intervals in space and time. For the sake of brevity, only two spatial
dimensions are shown here. The extension of the method to three dimensions is
straightforward. The following abbreviated notation is used,

$$ \rho^n_{j,k} = \rho(x_0 + j\Delta x,\; y_0 + k\Delta y,\; t_0 + n\Delta t) \tag{3.3} $$

where $x_0, y_0$ denote the space coordinates of the point at the left-bottom corner of
the grid according to a Cartesian coordinate system, and $t_0$ is the starting time of
the integration. Below, variables without any space sub-indices, for example $\rho^n$,
are assumed to be $\rho^n_{j,k}$. Also, the notation $u = V_x$ and $v = V_y$ is used to
avoid confusion with indices. The continuous Navier-Stokes equations in two dimensions
can be written as follows.


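$$ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} + \frac{\partial (\rho v)}{\partial y} = 0 \tag{3.4} $$

$$ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial x} - \nu \nabla^2 u = 0 \tag{3.5} $$

$$ \frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial y} - \nu \nabla^2 v = 0 \tag{3.6} $$
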

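If we use the following difference operators (forward-Euler for time and symmetric
differences for space),

$$ \frac{\partial u}{\partial t} \rightarrow \delta_t u = \frac{u^{n+1} - u^n}{\Delta t} \tag{3.7} $$

$$ \frac{\partial u}{\partial x} \rightarrow \delta_x u_{j,k} = \frac{u_{j+1,k} - u_{j-1,k}}{2 \Delta x} \tag{3.8} $$

$$ \frac{\partial u}{\partial y} \rightarrow \delta_y u_{j,k} = \frac{u_{j,k+1} - u_{j,k-1}}{2 \Delta y} \tag{3.9} $$

$$ \frac{\partial^2 u}{\partial x^2} \rightarrow \delta_{xx} u_{j,k} = \frac{u_{j+1,k} - 2 u_{j,k} + u_{j-1,k}}{\Delta x^2} \tag{3.10} $$

the discretized Navier-Stokes equations can be written as follows,

$$ \rho^{n+1} = \rho^n - \Delta t \left[ \rho^n \delta_x u^{n+1} + \rho^n \delta_y v^{n+1} + u^{n+1} \delta_x \rho^n + v^{n+1} \delta_y \rho^n \right] \tag{3.11} $$

$$ u^{n+1} = u^n + \Delta t \left[ \nu (\delta_{xx} + \delta_{yy}) u^n - u^n \delta_x u^n - v^n \delta_y u^n - \frac{c_s^2}{\rho^n} \delta_x \rho^n \right] \tag{3.12} $$

$$ v^{n+1} = v^n + \Delta t \left[ \nu (\delta_{xx} + \delta_{yy}) v^n - u^n \delta_x v^n - v^n \delta_y v^n - \frac{c_s^2}{\rho^n} \delta_y \rho^n \right] \tag{3.13} $$
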

Equations 3.12 and 3.13 produce immediately the new velocity at the next time step
$t + \Delta t$ because all the terms of the momentum equations are discretized explicitly
(evaluated at time $t$). Equation 3.11 however is slightly different from the momentum
equations. The mass continuity equation is discretized in a semi-implicit way which
means that the velocity values at time $t + \Delta t$ are used to compute the new density
value $\rho(t + \Delta t)$ at time $t + \Delta t$. In other words, the computation
proceeds in two steps: first, the new velocity is calculated, and then the new density is
calculated in a separate loop. This two-step procedure is very important for numerical
stability. If both the density and the velocity are discretized explicitly, the algebraic
system becomes very unstable. This can be easily checked in numerical experiments, and a
plausible theoretical explanation is given in section 3.3.3.
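
The two-step ordering (velocity first, then density from the new velocity) can be
sketched with NumPy as follows. This is only an illustration, not the code used for the
flue pipe simulations: the function names are hypothetical, the arrays are indexed as
`f[k, j]` (y index first), and periodic wrap-around through `np.roll` is assumed purely
to keep the sketch short, whereas real boundaries need the treatment of section 3.3.4.

```python
import numpy as np

def d_x(f, h):
    """Symmetric first difference along x (axis 1); periodic wrap is an assumption."""
    return (np.roll(f, -1, axis=1) - np.roll(f, 1, axis=1)) / (2.0 * h)

def d_y(f, h):
    """Symmetric first difference along y (axis 0)."""
    return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2.0 * h)

def d_xx(f, h):
    """Three-point second difference along x."""
    return (np.roll(f, -1, axis=1) - 2.0 * f + np.roll(f, 1, axis=1)) / h**2

def d_yy(f, h):
    """Three-point second difference along y."""
    return (np.roll(f, -1, axis=0) - 2.0 * f + np.roll(f, 1, axis=0)) / h**2

def step_2d(rho, u, v, dt, dx, dy, cs, nu):
    """Advance density and velocity by one time step dt."""
    # Step 1: momentum equations, every term evaluated at the old time level.
    u_new = u + dt * (nu * (d_xx(u, dx) + d_yy(u, dy))
                      - u * d_x(u, dx) - v * d_y(u, dy)
                      - cs**2 / rho * d_x(rho, dx))
    v_new = v + dt * (nu * (d_xx(v, dx) + d_yy(v, dy))
                      - u * d_x(v, dx) - v * d_y(v, dy)
                      - cs**2 / rho * d_y(rho, dy))
    # Step 2: continuity equation, using the NEW velocity (semi-implicit density).
    rho_new = rho - dt * (rho * (d_x(u_new, dx) + d_y(v_new, dy))
                          + u_new * d_x(rho, dx) + v_new * d_y(rho, dy))
    return rho_new, u_new, v_new
```

Replacing `u_new`, `v_new` by `u`, `v` in step 2 gives the fully explicit variant that
the text describes as very unstable.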


3.3.1  Numerical  stability 

Numerical stability conditions for the explicit finite difference method (3.11 to 3.13)
are not known exactly. However, a few approximate estimates can be obtained. First, the
CFL condition says that the domain of numerical dependence must include the domain of
physical dependence. After some manipulations, the following conditions are obtained
(see section 3.3.2 for a detailed derivation),

$$ \frac{\Delta x}{\Delta t} > \left( |V_x| + |V_y| + c_s \sqrt{2} \right) \tag{3.14} $$

or more generally,

$$ \Delta t < \left[ \frac{|V_x|}{\Delta x} + \frac{|V_y|}{\Delta y} + c_s \sqrt{\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}} \right]^{-1} \tag{3.15} $$

To satisfy the above equations, the time step of integration $\Delta t$ must be kept very
small in the case of subsonic flow where the speed of sound is very large.
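
As a small illustration, the general CFL bound can be evaluated directly. The sketch
below assumes the bound has the form
$\Delta t < [\,|V_x|/\Delta x + |V_y|/\Delta y + c_s (1/\Delta x^2 + 1/\Delta y^2)^{1/2}\,]^{-1}$;
the function name and the argument names are hypothetical.

```python
import math

def cfl_time_step(vx, vy, cs, dx, dy):
    """Upper bound on the time step from the general CFL condition.

    vx, vy : flow velocity components
    cs     : speed of sound
    dx, dy : grid spacings
    """
    rate = (abs(vx) / dx + abs(vy) / dy
            + cs * math.sqrt(1.0 / dx**2 + 1.0 / dy**2))
    return 1.0 / rate
```

For a square grid the result reduces to $\Delta x / (|V_x| + |V_y| + c_s \sqrt{2})$, so
for flow speeds much smaller than $c_s$ the bound is controlled almost entirely by the
speed of sound.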

Another stability condition that takes into account viscous effects can be derived as
follows. We consider the linear advection-diffusion equation (Peyret & Taylor [38, p.65])
which is simple to analyze, and is a special case of the momentum Navier-Stokes
equations. The advection-diffusion equation has the following form,

$$ \frac{\partial f}{\partial t} + A \frac{\partial f}{\partial x} + B \frac{\partial f}{\partial y} - \nu \nabla^2 f = 0 \tag{3.16} $$

where $f$ is the variable that is diffused; for example, the fluid momentum. The
coefficients $A$ and $B$ correspond to the fluid speed

$$ A = |V_x| , \qquad B = |V_y| $$

and they are assumed to be constant for the purpose of linear analysis. The explicit
discretization of equation 3.16 produces,

$$ f^{n+1} = f^n - \Delta t \left[ A \delta_x f^n + B \delta_y f^n - \nu (\delta_{xx} + \delta_{yy}) f^n \right] \tag{3.17} $$

By applying the von Neumann stability analysis to the above (see section 3.3.3 for a
description), we get the following constraints (Peyret & Taylor [38, p.65]) in the case
of $\Delta x = \Delta y$,

$$ \Delta t < \frac{2 \nu}{|V_x|^2 + |V_y|^2} \tag{3.18} $$

and also,

$$ \frac{\nu \Delta t}{\Delta x^2} < \frac{1}{4} \tag{3.19} $$

Although the above conditions are necessary, they are not sufficient. The simulation
of subsonic compressible flow at high Reynolds numbers is susceptible to slow-growing
numerical instabilities of very high spatial frequency. These stability problems are
discussed in chapter 5, and they can be avoided by including artificial viscosity
(fourth-order numerical dissipation) which filters very high spatial frequencies.

Figure 3-3: The numerical domain of dependence for $\Delta x = \Delta y$.
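
The two viscous constraints of section 3.3.1 can be evaluated in the same spirit. The
sketch below assumes they read $\Delta t < 2\nu / (V_x^2 + V_y^2)$ and
$\nu \Delta t / \Delta x^2 < 1/4$ for $\Delta x = \Delta y$ and $\nu > 0$; the function
name is hypothetical, and the result is only a necessary bound, since the text notes
that these conditions are not sufficient.

```python
import math

def advection_diffusion_limits(vx, vy, nu, dx):
    """Return (dt_advective, dt_diffusive) for a square grid with spacing dx.

    dt_advective : bound 2*nu / (vx^2 + vy^2), infinite for a fluid at rest
    dt_diffusive : bound dx^2 / (4*nu)
    """
    speed_sq = vx**2 + vy**2
    dt_advective = math.inf if speed_sq == 0.0 else 2.0 * nu / speed_sq
    dt_diffusive = dx**2 / (4.0 * nu)
    return dt_advective, dt_diffusive
```

In practice the smallest of these two values and the CFL bound would be taken as the
largest admissible time step.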

3.3.2  Derivation  of  CFL  formula 

To derive the CFL stability equation 3.14, we consider a node with four neighbors in a
square grid as shown in figure 3-3. The goal is to compare the numerical and the
physical domains of dependence. We observe that the four neighbors are the only nodes
that can influence the central node after one time step $\Delta t$. Thus, the numerical
domain of dependence of the central node is the square area that is enclosed by straight
lines drawn between the four neighbors. The physical domain of dependence that arises
from acoustic waves is a circle of radius $\Delta s = c_s \Delta t$, and it must be
contained within the square area. Simple geometry shows that the maximum radius
$\Delta s$ is given by the following formula,

$$ \Delta s = \frac{1}{2} \left( \sqrt{2}\, \Delta x \right) \tag{3.20} $$

and thus we must have,

$$ c_s \Delta t < \frac{1}{2} \left( \sqrt{2}\, \Delta x \right) \tag{3.21} $$

$$ \frac{\Delta x}{\Delta t} > c_s \sqrt{2} \tag{3.22} $$

Figure 3-4: The numerical domain of dependence for $\Delta x \neq \Delta y$.

Similarly, the physical domain of dependence that arises from hydrodynamic motion must
be contained within the numerical domain. Thus, we must have,

$$ \frac{\Delta x}{\Delta t} > |V_x| \tag{3.23} $$

$$ \frac{\Delta x}{\Delta t} > |V_y| \tag{3.24} $$

A simple way to combine all of the above inequalities is to require that
$\Delta x / \Delta t$ is greater than the sum of the individual positive terms. This
produces the inequality that we wished to prove,

$$ \frac{\Delta x}{\Delta t} > \left( |V_x| + |V_y| + c_s \sqrt{2} \right) \tag{3.25} $$



The more general CFL stability equation 3.15 is derived in a similar way. We consider
the numerical domain shown in figure 3-4 that has a rhombic shape. We have the following
geometric relations, where $a$ is the length shown in figure 3-4,

$$ \Delta s^2 + a^2 = \Delta y^2 \tag{3.26} $$

$$ \Delta s^2 + \left( \sqrt{\Delta x^2 + \Delta y^2} - a \right)^2 = \Delta x^2 \tag{3.27} $$

After some algebra we can obtain,

$$ \Delta s = \left( \sqrt{\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}} \right)^{-1} \tag{3.28} $$

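
The omitted algebra can be reconstructed as follows. This is a sketch under an assumption
about the figure: $\Delta s$ is taken to be the perpendicular distance from the central
node to a side of the rhombus, and $a$ the distance along that side from the vertex at
$(0, \Delta y)$ to the foot of the perpendicular, so that the side of length
$L = \sqrt{\Delta x^2 + \Delta y^2}$ is split into $a$ and $L - a$.

$$
\begin{aligned}
\Delta s^2 + a^2 &= \Delta y^2 , \qquad \Delta s^2 + (L - a)^2 = \Delta x^2 \\
\text{(subtracting)} \quad L^2 - 2 L a &= \Delta x^2 - \Delta y^2
  \;\;\Rightarrow\;\; a = \frac{\Delta y^2}{L} \\
\Delta s^2 &= \Delta y^2 - \frac{\Delta y^4}{L^2}
  = \frac{\Delta x^2 \, \Delta y^2}{\Delta x^2 + \Delta y^2} \\
\Delta s &= \left( \frac{1}{\Delta x^2} + \frac{1}{\Delta y^2} \right)^{-1/2}
\end{aligned}
$$

For $\Delta x = \Delta y$ this gives $\Delta s = \Delta x / \sqrt{2}$, in agreement with
the square-grid result.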

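The physical domain of dependence that arises from acoustic waves must be smaller than
the numerical domain. Thus, we must have,

$$ \Delta s > c_s \Delta t \tag{3.29} $$

or equivalently,

$$ \Delta t^{-1} > c_s \sqrt{\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}} \tag{3.30} $$

We must also satisfy the hydrodynamic constraints,

$$ \Delta t^{-1} > \frac{|V_x|}{\Delta x} \tag{3.31} $$

$$ \Delta t^{-1} > \frac{|V_y|}{\Delta y} \tag{3.32} $$

The above inequalities can be combined additively to produce the inequality,

$$ \Delta t < \left[ \frac{|V_x|}{\Delta x} + \frac{|V_y|}{\Delta y} + c_s \sqrt{\frac{1}{\Delta x^2} + \frac{1}{\Delta y^2}} \right]^{-1} \tag{3.33} $$

This is the general form of the CFL condition in two dimensions for an explicit
numerical method that employs nearest neighbor interactions.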

3.3.3  Semi-implicit  density


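An explanation why a semi-implicit discretization of the continuity equation leads to
better stability properties than a fully explicit discretization is as follows. Let us
write the discretized Navier-Stokes equations in one-dimensional form for simplicity. We
write the mass continuity equation and the momentum conservation equation along the
x-direction as follows,

$$ \rho_j^{n+1} = \rho_j^n - \Delta t \left[ \rho_j^n \frac{u_{j+1}^{n+1} - u_{j-1}^{n+1}}{2 \Delta x} + u_j^{n+1} \frac{\rho_{j+1}^n - \rho_{j-1}^n}{2 \Delta x} \right] \tag{3.34} $$

$$ u_j^{n+1} = u_j^n + \Delta t \left[ \nu \frac{u_{j+1}^n - 2 u_j^n + u_{j-1}^n}{\Delta x^2} - u_j^n \frac{u_{j+1}^n - u_{j-1}^n}{2 \Delta x} - \frac{c_s^2}{\rho_j^n} \frac{\rho_{j+1}^n - \rho_{j-1}^n}{2 \Delta x} \right] \tag{3.35} $$

Equation 3.34 is a semi-implicit discretization of the continuity equation. To compare,
an explicit discretization is as follows,

$$ \rho_j^{n+1} = \rho_j^n - \Delta t \left[ \rho_j^n \frac{u_{j+1}^n - u_{j-1}^n}{2 \Delta x} + u_j^n \frac{\rho_{j+1}^n - \rho_{j-1}^n}{2 \Delta x} \right] \tag{3.36} $$
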

We now apply the von Neumann frequency analysis (Peyret & Taylor [38, p.344]). We write
the different variables in terms of their frequency components, and we analyze each
frequency separately (non-linear combinations of frequencies are ignored). We have,

$$ \rho^n = e^{i \kappa_0 x} , \qquad \rho^{n+1} = G_0 \, e^{i \kappa_0 x} , \qquad u^n = A \, e^{i \kappa_1 x} , \qquad u^{n+1} = G_1 A \, e^{i \kappa_1 x} \tag{3.37} $$

where $A$ is the velocity amplitude, and $G_0$, $G_1$ are the growth factors
corresponding to the spatial frequencies $\kappa_0$, $\kappa_1$ of the density and
velocity respectively. The imaginary unit of complex numbers is denoted by
$i = \sqrt{-1}$, and it should not be confused with indices because $i$ is never used as
an index here. The following identities are very useful,

$$ \delta_x \rho^n = \rho^n \, i \, \Delta x^{-1} \sin(\kappa_0 \Delta x) , \qquad \delta_{xx} \rho^n = \rho^n \, 2 \, \Delta x^{-2} \left( \cos(\kappa_0 \Delta x) - 1 \right) \tag{3.38} $$

Below, the following dimensionless constants are used for brevity,

$$ C_1 = (\Delta t / \Delta x) A , \qquad C_2 = (\nu \Delta t / \Delta x^2) , \qquad C_3 = (\Delta t / \Delta x)(c_s^2 / A) \tag{3.39} $$

If we substitute the exponentials of equation 3.37 into equations 3.34 and 3.35, we
obtain the following equations,

$$ G_0 = 1 - i \, G_1 C_1 \, e^{i \kappa_1 x} \left( \sin \kappa_0 \Delta x + \sin \kappa_1 \Delta x \right) \tag{3.40} $$

$$ G_1 = 1 + 2 C_2 \left( \cos \kappa_1 \Delta x - 1 \right) - i \, C_3 \, e^{-i \kappa_1 x} \sin \kappa_0 \Delta x - i \, C_1 \sin \kappa_1 \Delta x \tag{3.41} $$

By contrast, the explicit discretization of the continuity equation produces the
following,

$$ G_0 = 1 - i \, C_1 \, e^{i \kappa_1 x} \left( \sin \kappa_0 \Delta x + \sin \kappa_1 \Delta x \right) \tag{3.42} $$

A necessary condition for stability is that the magnitude of each growth factor $G_0$,
$G_1$ individually should not be larger than unity for all possible frequencies
$\kappa_0$, $\kappa_1$. The largest frequency that is possible on a grid of spacing
$\Delta x$ corresponds to a wavelength of $2 \Delta x$ (2 nodes per cycle),

$$ 0 < \kappa_0, \kappa_1 < \frac{\pi}{\Delta x} \tag{3.43} $$

Different choices of $\kappa_0$, $\kappa_1$ within the above range can be substituted in
equations 3.40 and 3.41 (semi-implicit version), or in equations 3.42 and 3.41 (explicit
version), to derive stability conditions. The algebra is rather complicated, and is
omitted here. Instead, we notice that the $G_0$ factor of the semi-implicit version
(equation 3.40) is almost identical to the $G_0$ factor of the explicit version (equation
3.42) except for the extra $G_1$. In the explicit version, the magnitude of $G_0$ is
always greater than unity, but in the semi-implicit version the magnitude of $G_0$ can be
less than unity because of the extra $G_1$. A complete analysis requires carrying out the
complex multiplications, collecting terms, considering the variation of
$e^{i \kappa_1 x}$ in space, etc. The above preliminary analysis gives a basic idea of
why the semi-implicit version can be expected to be more stable than the explicit
version, a fact which can be easily observed experimentally.
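
The experimental observation can be reproduced with a small one-dimensional test. The
sketch below is an illustration under stated assumptions, not the thesis code: it uses
the one-dimensional continuity and momentum updates of this section with periodic
boundaries, zero viscosity, $c_s \Delta t / \Delta x = 0.5$, and a tiny sinusoidal
density perturbation of wavelength $8 \Delta x$; all function and parameter names are
hypothetical.

```python
import numpy as np

def ddx(f, dx):
    """Symmetric first difference with periodic wrap-around (assumed)."""
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dx)

def d2dx2(f, dx):
    """Three-point second difference with periodic wrap-around."""
    return (np.roll(f, -1) - 2.0 * f + np.roll(f, 1)) / dx**2

def step_1d(rho, u, dt, dx, cs, nu, semi_implicit):
    """One time step: explicit momentum, then explicit or semi-implicit density."""
    u_new = u + dt * (nu * d2dx2(u, dx) - u * ddx(u, dx)
                      - cs**2 / rho * ddx(rho, dx))
    # The only difference between the two variants is which velocity
    # enters the continuity equation.
    u_c = u_new if semi_implicit else u
    rho_new = rho - dt * (rho * ddx(u_c, dx) + u_c * ddx(rho, dx))
    return rho_new, u_new

def run_experiment(semi_implicit, n_steps=100, n=64, eps=1e-6):
    """Return (final density perturbation amplitude) / (initial amplitude)."""
    dx, dt, cs, nu = 1.0, 0.5, 1.0, 0.0
    x = np.arange(n) * dx
    rho = 1.0 + eps * np.sin(2.0 * np.pi * x / (8.0 * dx))
    u = np.zeros(n)
    for _ in range(n_steps):
        rho, u = step_1d(rho, u, dt, dx, cs, nu, semi_implicit)
    return np.max(np.abs(rho - 1.0)) / eps

if __name__ == "__main__":
    print("explicit      growth:", run_experiment(False))
    print("semi-implicit growth:", run_experiment(True))
```

In this setting the explicit variant amplifies the perturbation by more than two orders
of magnitude within 100 steps, while the semi-implicit variant keeps it close to its
initial amplitude. This agrees with a linearized analysis, in which the explicit update
has growth factors of magnitude greater than one and the semi-implicit update has
growth factors of unit magnitude for $c_s \Delta t / \Delta x < 2$.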

3.3.4  Boundary  conditions


The  modeling  of  boundaries  is  a  very  important  part  of  a  numerical  method.  The 
boundaries include the internal obstacles and the perimeter that encloses the simulated
region (it should be noted that periodic boundaries are not useful in the case of
flue  pipes).  Near  a  boundary,  the  numerical  method  must  take  into  account  the  fact 
that  grid  points  are  available  only  on  the  interior  side  of  the  boundary.  For  instance, 
the  symmetric  differences  which  are  used  at  the  interior  nodes  (equation  3.8)  must 
be  replaced  with  asymmetric  differences  at  the  boundary  nodes.  Furthermore,  the 
numerical  boundary  conditions  must  be  chosen  properly  to  model  the  desired  physical 
conditions  such  as  a  non-slip  wall,  an  inlet,  and  an  outlet. 

A non-slip wall means that the velocity variables $V_x$, $V_y$ are always equal to zero;
therefore, only the density needs to be calculated at a non-slip wall. The approach which
is used in the simulations of flue pipes is to compute the density $\rho$ by applying
asymmetric finite differences to the continuity equation. In particular, the central
differences of equation 3.8 are replaced with asymmetric differences denoted by
$\delta_x^-$ and $\delta_x^+$ as follows,

$$ \frac{\partial \rho}{\partial x} \rightarrow \delta_x^- \rho = \frac{3 \rho_{j,k} - 4 \rho_{j-1,k} + \rho_{j-2,k}}{2 \Delta x} \tag{3.44} $$

$$ \frac{\partial \rho}{\partial x} \rightarrow \delta_x^+ \rho = \frac{-3 \rho_{j,k} + 4 \rho_{j+1,k} - \rho_{j+2,k}}{2 \Delta x} \tag{3.45} $$

and similarly for the y-direction.
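
As an illustration, the two one-sided formulas can be written as small helper functions.
The sketch assumes the standard second-order three-point stencils
$(3 f_j - 4 f_{j-1} + f_{j-2}) / (2 \Delta x)$ and
$(-3 f_j + 4 f_{j+1} - f_{j+2}) / (2 \Delta x)$; the function names are hypothetical.

```python
def backward_diff(f, j, h):
    """One-sided difference at node j using nodes j, j-1, j-2 (wall on the right)."""
    return (3.0 * f[j] - 4.0 * f[j - 1] + f[j - 2]) / (2.0 * h)

def forward_diff(f, j, h):
    """One-sided difference at node j using nodes j, j+1, j+2 (wall on the left)."""
    return (-3.0 * f[j] + 4.0 * f[j + 1] - f[j + 2]) / (2.0 * h)
```

Both stencils are exact for quadratic profiles, so they keep the second-order accuracy
of the central differences that are used at interior nodes.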

An alternative approach, which is not used in the simulations of flue pipes, is to
compute the density at a non-slip wall by simple extrapolation in a direction normal to
the boundary wall. Preliminary experiments which I have performed indicate that in the
case of non-slip walls, the continuity equation with asymmetric differences works better
than extrapolating the density. However, the extrapolation approach is described here
for completeness. Extrapolation amounts to setting

$$ \rho(x_b) = \rho(x_b - \Delta x) \tag{3.46} $$

where $x_b$ is the boundary wall. The justification for the extrapolation condition
comes from considering the momentum Navier-Stokes equation at the wall,

$$ \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} + \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial x} - \nu \nabla^2 u = 0 \tag{3.47} $$

Since $u = v = 0$, most of the above terms vanish, and we obtain,

$$ \frac{c_s^2}{\rho} \frac{\partial \rho}{\partial x} - \nu \frac{\partial^2 u}{\partial x^2} = 0 \tag{3.48} $$

The speed of sound $c_s$ is very large compared to the flow speed $u$. Thus, it makes
sense to approximate the above with the condition $\partial \rho / \partial x = 0$ which
gives the extrapolation condition for the density at a non-slip wall.

There are also other approaches for calculating the density at the boundary (more
sophisticated than the above), and some of them are described in Poinsot & Lele [39].
After some algebra, it is possible to show that the formulas of Poinsot & Lele [39] in
the case of a non-slip wall are equivalent to applying asymmetric differences to the
continuity equation with the addition of some correction terms which are proportional to
the Mach number; hence, they are small in the case of subsonic flow. Because the
correction terms introduce complexity and additional finite differencing, I do not use
them in the simulations of flue pipes.

Boundary conditions for modeling an inlet and an outlet are discussed later in section
7.3. Below, a finite difference method for simulating incompressible flow is described.


3.4  Incompressible  finite  difference  method 

The incompressible finite difference method described here is employed for numerical
testing purposes only. In particular, in sections 4.4 and 4.5 the numerical accuracy of
the lattice Boltzmann method is tested on fluid flows that have exact analytic
solutions. These exact solutions assume a perfectly incompressible flow, and they ignore
acoustic waves. To compare the lattice Boltzmann method with methods specifically
designed for perfectly incompressible flows, the following incompressible finite
difference method is used. The continuity equation 2.52 is replaced with the
divergence-free condition for the velocity field,

$$ \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z} = 0 \tag{3.49} $$

The momentum equations remain as before, namely,

$$ \frac{\partial V_x}{\partial t} + V_x \frac{\partial V_x}{\partial x} + V_y \frac{\partial V_x}{\partial y} + V_z \frac{\partial V_x}{\partial z} + \frac{\partial P}{\partial x} - \nu \nabla^2 V_x = 0 \tag{3.50} $$

$$ \frac{\partial V_y}{\partial t} + V_x \frac{\partial V_y}{\partial x} + V_y \frac{\partial V_y}{\partial y} + V_z \frac{\partial V_y}{\partial z} + \frac{\partial P}{\partial y} - \nu \nabla^2 V_y = 0 \tag{3.51} $$

To advance the solution, the momentum equations are discretized explicitly, while the
pressure term is omitted when calculating the first estimate of the velocity. Then, the
velocity estimate is corrected in order to satisfy incompressibility by solving a
Poisson equation,


$$ \nabla^2 \phi = \frac{\partial V_i^*}{\partial x_i} \tag{3.52} $$

where $V_i^*$ is the first estimate of the velocity, and the Einstein summation is
implied. The above Poisson equation computes the part of the velocity field that has
non-zero divergence, which is then subtracted from the initial velocity estimate to
obtain a divergence-free velocity as follows,

$$ V_i(t + \Delta t) = V_i^* - \frac{\partial \phi}{\partial x_i} \tag{3.53} $$


The correction of the velocity can also be viewed as a projection of the initial
velocity field onto the space of divergence-free velocity fields. Accordingly, this
method is called a projection method. The projection takes into account the pressure
effects that were omitted in the first estimate of the velocity (Peyret & Taylor
[38, p.160]). In addition, the solution of the Poisson equation provides an estimate of
the pressure at the current time-step as follows,

$$ P = \phi / \Delta t \tag{3.54} $$

In the numerical tests of sections 4.4 and 4.5, the Poisson equation is solved with
Successive-Over-Relaxation (SOR) [40, page 680] using an orthogonal non-staggered grid.
Also, forward-Euler is used to estimate the time derivative, and centered differences
(3-point symmetric) are used to calculate the spatial derivatives.
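
The projection step itself can be sketched compactly. The example below is an
illustration only and deliberately swaps in a different Poisson solver: instead of SOR
with centered differences, it solves the Poisson equation spectrally with NumPy FFTs on
a periodic two-dimensional grid, which is an assumption made to keep the sketch short.
All function names are hypothetical, and arrays are indexed as `f[k, j]` (y index
first).

```python
import numpy as np

def wavenumbers(shape, dx, dy):
    """Angular wavenumber grids (kx, ky), each with the shape of the field."""
    ny, nx = shape
    kx = 2.0 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2.0 * np.pi * np.fft.fftfreq(ny, d=dy)
    return np.meshgrid(kx, ky)

def divergence(vx, vy, dx, dy):
    """Spectral divergence of a periodic velocity field."""
    kx, ky = wavenumbers(vx.shape, dx, dy)
    div_hat = 1j * kx * np.fft.fft2(vx) + 1j * ky * np.fft.fft2(vy)
    return np.real(np.fft.ifft2(div_hat))

def project(vx_star, vy_star, dx, dy):
    """Solve laplacian(phi) = div(V*) and return (V* - grad(phi), phi)."""
    kx, ky = wavenumbers(vx_star.shape, dx, dy)
    vx_hat = np.fft.fft2(vx_star)
    vy_hat = np.fft.fft2(vy_star)
    div_hat = 1j * kx * vx_hat + 1j * ky * vy_hat
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0            # avoid 0/0; the mean value of phi is arbitrary
    phi_hat = -div_hat / k2   # in Fourier space: -k^2 * phi_hat = div_hat
    phi_hat[0, 0] = 0.0
    vx = np.real(np.fft.ifft2(vx_hat - 1j * kx * phi_hat))
    vy = np.real(np.fft.ifft2(vy_hat - 1j * ky * phi_hat))
    phi = np.real(np.fft.ifft2(phi_hat))
    return vx, vy, phi
```

A field that is already divergence-free passes through `project` unchanged, which is the
defining property of a projection.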

In the next chapter, the lattice Boltzmann method for simulating subsonic compressible
flow is presented.


Chapter  4 

The  lattice  Boltzmann  method 


The  lattice  Boltzmann  (LB)  method  is  a  numerical  scheme  for  simulating  viscous 
compressible  flow  in  the  subsonic  regime  (Koelman  [29],  Qian  [41],  Chen  [10]).  In 
this  chapter,  the  LB  method  is  analyzed,  and  two  major  results  are  presented:  the 
development  of  a  new  technique  for  accurate  boundary  and  initial  conditions  for  the 
LB  method,  and  the  demonstration  that  the  LB  method  is  second-order  accurate  in 
space  and  in  time. 

In the next section, the basic LB algorithm is reviewed, and the hexagonal 7-speed LB
model is described. The 7-speed model has the smallest number of populations $F_i$ that
are necessary to give correct Navier-Stokes in two dimensions. Because of its
simplicity, the 7-speed model is used in all the theoretical discussions here. In
section 4.2, techniques for accurate boundary and initial conditions for the LB method
are analyzed. In section 4.3, the 9-speed LB model for 2D orthogonal grids, and also the
15-speed LB model for 3D orthogonal grids are described.

In sections 4.4 and 4.5, the numerical accuracy of the LB method is tested
experimentally on initial and on boundary value problems. The LB method is shown to be
second-order accurate in space and in time. Also, the LB method is compared against an
explicit finite difference method for incompressible flow. In section 4.6.1, the
modeling of a non-slip wall and the calculation of density at a non-slip wall are
discussed. In section 4.6.2, an approach for developing composite grids (grids of
different resolution joined together) for the LB method is outlined.

There is also an appendix where the numerical roundoff error of the LB method is
analyzed (section 4.7.1), and the relationship between lattice gas and lattice Boltzmann
is discussed (section 4.7.2).


Figure  4-1:  The  8  moving  populations  of  the  orthogonal  lattice  Boltzmann  method. 


Figure  4-2:  The  6  moving  populations  of  the  hexagonal  lattice  Boltzmann  method. 

4.1  Basics  of  lattice  Boltzmann

The ideas behind the lattice Boltzmann approach (and lattice gas of section 4.7.2) come
from the kinetic theory of gases. According to kinetic theory, the dynamics of flows at
length scales comparable to the mean free path are described by a Boltzmann equation,

$$ \frac{\partial f}{\partial t} + \vec{v} \cdot \nabla f = (-1/\tau)(f - f^{eq}) \tag{4.1} $$

where $f(\vec{x}, \vec{v}, t)$ represents the density of particles inside an
infinitesimal volume $(\vec{x}, \vec{x} + d\vec{x})$ with velocity
$(\vec{v}, \vec{v} + d\vec{v})$ at time $t$. The left-hand side of equation 4.1
represents the advection of particles with velocity $\vec{v}$, and the right-hand side
represents the collision between particles. The collision operator of equation 4.1 is
known as the BGK [4] relaxation with time constant $\tau$ towards local equilibrium
$f^{eq}$ (typically, a Maxwell-Boltzmann equilibrium).¹ Starting from the Boltzmann
equation 4.1 which describes flow at microscopic scales, it is possible to derive the
Navier-Stokes equations which describe flow at macroscopic scales (at least 100 times
the mean free path). Such a derivation requires a suitable averaging of the Boltzmann
equation over all possible velocities, and also a Chapman-Enskog expansion (see section
4.1.2).

The  lattice  Boltzmann  method  takes  the  structure  of  the  Boltzmann  equation  4.1 
and  the  ideas  of  kinetic  theory,  and  applies  them  to  macroscopic  length  scales  using  a 
discrete set of velocities instead of a continuous set of velocities $\vec{v}$. Despite
the coarsening of length scale and the discrete set of velocities, the lattice Boltzmann method
manages  to  produce  the  Navier  Stokes  equations  in  a  similar  way  that  kinetic  theory 
does.  The  key  ingredients  that  make  the  kinetic  approach  work,  are  the  advection 
of  particles  and  the  collision  of  particles  (relaxation)  conserving  mass,  momentum, 
and  energy.  An  additional  feature  is  that  the  discrete  set  of  velocities  requires  a 
highly-symmetric  lattice  (grid)  on  which  the  particles  can  move  [20,  15,  58]. 

¹ More complex collision operators can also be used to describe 2-pair, 3-pair, etc.
interactions between particles (this leads to the BBGKY hierarchy of equations
[27, p.65]).

In two dimensions, typical lattices are the hexagonal and the orthogonal lattices shown
in figures 4-1 and 4-2. In the lattice Boltzmann method, each node of the lattice is
associated with a set of moving particles or populations $F_i$. The fluid variables
$\rho, V_x, V_y$ can be obtained from the $F_i$ via a simple summation at each fluid
node,

$$ \rho = \sum_i F_i , \qquad \rho \vec{V} = \sum_i F_i \, \vec{e}_i \tag{4.2} $$

where $\vec{e}_i$ are the discrete velocities of the lattice; for example, on a
hexagonal lattice,

$$ \vec{e}_i = \frac{\Delta x}{\Delta t} \left( \cos \frac{2 \pi (i - 1)}{6} ,\; \sin \frac{2 \pi (i - 1)}{6} \right) \tag{4.3} $$


where $\Delta x$ is the lattice spacing (distance between neighboring nodes) and
$\Delta t$ is the integration time step. Each node in a hexagonal lattice has 6 nearest
neighbors, and the simplest lattice Boltzmann method has 6 moving populations at each
node. These populations are shifted (advected) from one lattice site to another, and are
relaxed towards local equilibrium by means of a collision operator which conserves mass,
momentum, and energy just like a particle collision. The evolution equation is as
follows,

$$ F_i(\vec{x} + \vec{e}_i \Delta t,\; t + \Delta t) = F_i(\vec{x}, t) + \Omega_i \tag{4.4} $$

where $\Omega_i$ is the collision operator, and the left-hand side is the advection of
populations in discrete space and discrete time. Each evolution cycle consists of one
advection and one relaxation, and corresponds to one integration time step $\Delta t$ of
the LB method.

There are a number of ways of implementing a suitable collision operator. One approach
is to multiply the vector of the old populations $F_i$ by a suitable collision matrix in
order to produce the vector of the new populations (Gunstensen & Rothman [22],
Vergassola [55], Higuera [25]). A simpler approach is to apply a relaxation to each
population $F_i$ with a time constant $\tau$ (the BGK operator of equation 4.1),

$$ (\text{relaxed } F_i) = F_i - \frac{(F_i - F_i^{eq})}{\tau} \tag{4.5} $$

The evolution equation becomes as follows,

$$ F_i(\vec{x} + \vec{e}_i \Delta t,\; t + \Delta t) = F_i(\vec{x}, t) - \frac{\left( F_i(\vec{x}, t) - F_i^{eq}(\vec{x}, t) \right)}{\tau} \tag{4.6} $$

The BGK relaxation is the simplest collision operator that can produce Navier-Stokes in
the subsonic limit (the ratio of the flow speed divided by the speed of sound must be
small). To conserve mass and momentum, the equilibrium populations $F_i^{eq}$ must be
chosen so that

$$ \sum_i F_i^{eq} = \sum_i F_i , \qquad \sum_i F_i^{eq} \, \vec{e}_i = \sum_i F_i \, \vec{e}_i \tag{4.7} $$

A few additional requirements on the equilibrium populations $F_i^{eq}$ are described in
the next section. These additional requirements together with mass and momentum
conservation are sufficient to make the lattice Boltzmann method approximate the
Navier-Stokes equations.

It should be noted that the mapping from the populations $F_i$ to the fluid variables
$\rho, V_x, V_y$ is simple (equation 4.2). However, the inverse mapping from the fluid
variables $\rho, V_x, V_y$ to the populations $F_i$ is not as simple. The inverse
mapping is not needed for the basic LB algorithm, but is useful for implementing initial
and boundary conditions as explained in section 4.2.

4.1.1  Hexagonal  7-speed  model  (d2q7) 

The  hexagonal  7-speed  lattice  Boltzmann  method  is  described  in  detail  here.  It 
is  denoted  “d2q7”  following  the  naming  convention  of  Qian  [41].  We  consider  a 
hexagonal  lattice  (see  hgure  4-2)  with  six  moving  populations  denoted  by  Fi  i  = 
1, .  .  .  ,  6  and  one  rest-particle  population  denoted  by  Fq.  The  non-moving  population 
Fq  stays  hxed  at  each  node  and  undergoes  only  relaxation  (collision)  at  every  step. 
At  startup,  the  populations  Fi  are  initialized  from  the  fluid  variables  p,Vx,Vy  (see 
section  4.2).  After  initialization,  successive  steps  of  relaxation  and  advection  are 
performed  to  calculate  the  Fi  and  the  fluid  variables  p,Vx,Vy  at  later  times.  The 
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relaxation  and  advection  steps  are  described  by  the  following  formulas, 

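\[
\begin{aligned}
F_i(\vec{x} + \vec{e}_i \Delta t,\ t + \Delta t) &= F_i(\vec{x},t) + (-1/\tau)\left[ F_i(\vec{x},t) - F_i^{eq}(\vec{x},t) \right], \quad i = 1, \ldots, 6 \\
F_0(\vec{x},\ t + \Delta t) &= F_0(\vec{x},t) + (-1/\tau)\left[ F_0(\vec{x},t) - F_0^{eq}(\vec{x},t) \right] \\
\tau &= \frac{1}{2} + \frac{4\,\Delta t\,\nu}{\Delta x^2}
\end{aligned}
\tag{4.8}
\]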

The relaxation parameter $\tau$ is chosen to achieve the desired kinematic
viscosity $\nu$ given the space and time discretization parameters $\Delta x$,
$\Delta t$. The vector $\vec{e}_i$ stands for the six velocity directions of
the hexagonal lattice.


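\[
\vec{e}_i = \frac{\Delta x}{\Delta t} \left( \cos\frac{2\pi(i-1)}{6},\ \sin\frac{2\pi(i-1)}{6} \right)
\tag{4.9}
\]

The velocity $\vec{V}(\vec{x},t)$ and density $\rho(\vec{x},t)$ are computed
from the populations $F_i(\vec{x},t)$ using the relations,

\[
\begin{aligned}
\rho(\vec{x},t) &= \sum_{i=0}^{6} F_i(\vec{x},t) \\
\rho(\vec{x},t)\,\vec{V}(\vec{x},t) &= \sum_{i=1}^{6} F_i(\vec{x},t)\,\vec{e}_i
\end{aligned}
\tag{4.10}
\]
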
The variations of density around its mean value (the spatial mean, which is
constant in time) provide an estimate of the fluid pressure $P(\vec{x},t)$,
according to the following equation,

\[ P(\vec{x},t) = c_s^2 \left( \rho(\vec{x},t) - \langle\rho\rangle \right) \tag{4.11} \]

The speed of sound is,

\[ c_s = \sqrt{3 w_0}\,(\Delta x / \Delta t) \tag{4.12} \]

where the coefficient $w_0$ is discussed below. The equilibrium populations
$F_i^{eq}(\vec{x},t)$ are given by the following equations,

\[
\begin{aligned}
F_i^{eq}(\vec{x},t) &= \rho(\vec{x},t) \left[ w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V}) \right] \\
F_0^{eq}(\vec{x},t) &= \rho(\vec{x},t) \left[ z_0 + z_{21} (\vec{V} \cdot \vec{V}) \right] \\
6 w_0 + z_0 &= 1, \\
w_1 = 1/(3c^2), \quad w_{20} &= 2/(3c^4), \quad w_{21} = -1/(6c^2), \\
z_{21} = -1/c^2, \quad c &= \Delta x / \Delta t
\end{aligned}
\tag{4.13}
\]

The above coefficients are chosen so that the Chapman-Enskog expansion of the
evolution equation 4.8 matches the Navier Stokes equations (section 4.1.2). In
particular, the coefficient $w_1$ is determined from momentum conservation,
the coefficient $w_{20}$ is determined from Galilean invariance (i.e., the
advection term $(V_x\,\partial V_x/\partial x + V_y\,\partial V_x/\partial y)$
must appear in the Chapman-Enskog expansion with a constant factor equal to
one), the coefficient $w_{21}$ is chosen to eliminate the
$(\vec{V} \cdot \vec{V})$ dependence of the pressure, and the coefficient
$z_{21}$ is chosen to eliminate the $(\vec{V} \cdot \vec{V})$ term in the mass
conservation equation. There is some freedom in choosing the remaining
coefficients $w_0$ and $z_0$, but they must satisfy $6 w_0 + z_0 = 1$ to
conserve mass, and they must be positive for stability purposes. A simple
choice is $w_0 = z_0 = 1/7$.
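As an illustration, the conservation properties of the equilibrium formulas can be checked numerically. The following Python sketch assumes lattice units ($c = 1$), the simple choice $w_0 = z_0 = 1/7$, and the d2q7 coefficient values $w_1 = 1/(3c^2)$, $w_{20} = 2/(3c^4)$, $w_{21} = -1/(6c^2)$, $z_{21} = -1/c^2$. The function names are hypothetical and are not part of the simulation system described in this work.

```python
import numpy as np


def d2q7_velocities(c=1.0):
    """Six lattice velocities e_i = c (cos(2 pi (i-1)/6), sin(2 pi (i-1)/6)), i = 1..6."""
    angles = 2.0 * np.pi * np.arange(6) / 6.0
    return c * np.stack([np.cos(angles), np.sin(angles)], axis=1)


def d2q7_equilibrium(rho, v, c=1.0, w0=1.0 / 7.0):
    """Return (F0_eq, Fi_eq) for density rho and velocity v = (Vx, Vy)."""
    v = np.asarray(v, dtype=float)
    z0 = 1.0 - 6.0 * w0                       # mass constraint 6 w0 + z0 = 1
    w1, w20, w21 = 1 / (3 * c**2), 2 / (3 * c**4), -1 / (6 * c**2)
    z21 = -1 / c**2
    ev = d2q7_velocities(c) @ v               # e_i . V for i = 1..6
    vv = v @ v                                # V . V
    f_moving = rho * (w0 + w1 * ev + w20 * ev**2 + w21 * vv)
    f_rest = rho * (z0 + z21 * vv)
    return f_rest, f_moving


# Illustrative values only (assumed, not taken from the text).
rho, v = 1.2, np.array([0.05, -0.02])
f0, f = d2q7_equilibrium(rho, v)
mass = f0 + f.sum()                           # should reproduce rho
momentum = f @ d2q7_velocities()              # should reproduce rho * v
```

With these coefficients the sums reproduce $\rho$ and $\rho \vec{V}$ to rounding error for any small velocity, which is the property required of the equilibrium populations.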

The computational cycle of the lattice Boltzmann method is organized as
follows. The current lattice populations $F_i(\vec{x},t)$ are used to
calculate the velocity field $\vec{V}(\vec{x},t)$ and the density field
$\rho(\vec{x},t)$ according to equation 4.10. These fields are the numerical
solution at time $t$, and they are also used to compute the equilibrium
populations $F_i^{eq}(\vec{x},t)$, which are needed to advance the solution.
The equilibrium populations $F_i^{eq}(\vec{x},t)$ are used to relax the
$F_i(\vec{x},t)$ into “relaxed” populations, which are then advected according
to equation 4.8 to produce the lattice populations at the next time step. Then
the cycle repeats.
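The cycle described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the parallel implementation used in this work: it uses lattice units ($c = 1$), periodic boundaries, $w_0 = z_0 = 1/7$, and stores the hexagonal lattice in "axial" array coordinates (an assumption of this example) so that advection becomes an array shift. All names are hypothetical.

```python
import numpy as np

C = 1.0  # lattice speed dx/dt in lattice units (assumed for this sketch)
ANGLES = 2.0 * np.pi * np.arange(6) / 6.0
E = C * np.stack([np.cos(ANGLES), np.sin(ANGLES)], axis=1)  # e_1 .. e_6
# Neighbour offsets (dq, dr) in axial coordinates: node (q, r) is located at
# q*(1, 0) + r*(1/2, sqrt(3)/2), so offset k points along E[k].
OFFSETS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def moments(f0, f):
    """Density and velocity from the populations (equation 4.10)."""
    rho = f0 + f.sum(axis=0)
    momentum = np.tensordot(E.T, f, axes=([1], [0]))  # shape (2, nq, nr)
    return rho, momentum / rho


def equilibrium(rho, v, w0=1.0 / 7.0):
    """Equilibrium populations of the d2q7 model (equation 4.13)."""
    ev = np.tensordot(E, v, axes=([1], [0]))          # e_i . V, shape (6, nq, nr)
    vv = (v * v).sum(axis=0)
    f = rho * (w0 + ev / (3 * C**2) + 2 * ev**2 / (3 * C**4) - vv / (6 * C**2))
    f0 = rho * ((1.0 - 6.0 * w0) - vv / C**2)
    return f0, f


def step(f0, f, tau):
    """One cycle: moments, equilibrium, relaxation, advection."""
    rho, v = moments(f0, f)
    f0_eq, f_eq = equilibrium(rho, v)
    f0 = f0 - (f0 - f0_eq) / tau           # the rest population only relaxes
    f = f - (f - f_eq) / tau
    for k, offset in enumerate(OFFSETS):   # periodic advection along e_k
        f[k] = np.roll(f[k], shift=offset, axis=(0, 1))
    return f0, f


# Illustrative run: a small density wave at rest on a 16 x 16 periodic lattice.
nq = nr = 16
q = np.arange(nq)[:, None] * np.ones((1, nr))
rho_init = 1.0 + 0.01 * np.cos(2.0 * np.pi * q / nq)
v_init = np.zeros((2, nq, nr))
f0, f = equilibrium(rho_init, v_init)      # zero-order initialization
mass_start = f0.sum() + f.sum()
for _ in range(50):
    f0, f = step(f0, f, tau=0.8)
mass_end = f0.sum() + f.sum()
```

The sketch initializes the populations with their equilibrium values for brevity; the more accurate initialization discussed in section 4.2 is not included here.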

4.1.2 Chapman-Enskog expansion

The Chapman-Enskog expansion is outlined here. The goal of the Chapman-Enskog
expansion is to derive a set of partial differential equations in terms of
$\rho$ and $\rho\vec{V}$ that describe the behavior of the lattice Boltzmann
fluid in the limit of $\Delta x$, $\Delta t$ going to zero. During the
Chapman-Enskog expansion, it is assumed that the ratio
$\Delta x / \Delta t = c$ is constant, and that the ratio $(V/c)$ is small,
where $V$ is the macroscopic speed of the fluid. The final result of the
Chapman-Enskog expansion is the mass continuity equation and the Navier Stokes
momentum equations.

The first step is to Taylor-expand the population variable
$F_i(\vec{x} + \vec{e}_i \Delta t,\ t + \Delta t)$ in the evolution
equation 4.8 around the point $(\vec{x}, t)$. This produces an equation whose
left-hand side is a Taylor series and whose right-hand side is equal to
$(-1/\tau)(F_i - F_i^{eq})$. This equation has the following form,

\[
\Delta t \left( \frac{\partial}{\partial t} + \vec{e}_i \cdot \nabla \right) F_i
+ \frac{\Delta t^2}{2} \left( \frac{\partial}{\partial t} + \vec{e}_i \cdot \nabla \right)^2 F_i
+ \cdots = -\frac{1}{\tau} \left( F_i - F_i^{eq} \right)
\tag{4.14}
\]


The second step is to combine the Taylor series equation 4.14 with the mass and
momentum conservation relations (equation 4.10). This produces three equations
whose left-hand sides are Taylor series and whose right-hand sides vanish
because the equilibrium populations are chosen to satisfy mass and momentum
conservation (for example, $\sum_i (F_i - F_i^{eq}) = 0$). The three Taylor
series that are derived in this way contain partial derivatives of quantities
that are sums and tensors of the populations $F_i$. The equations have the
following form to first order,

\[
\frac{\partial}{\partial t} \Big( \sum_{i=0}^{6} F_i \Big)
+ \nabla \cdot \Big( \sum_{i=1}^{6} \vec{e}_i F_i \Big) + \cdots = 0
\tag{4.15}
\]

\[
\frac{\partial}{\partial t} \Big( \sum_{i=1}^{6} \vec{e}_i F_i \Big)
+ \nabla \cdot \Big( \sum_{i=1}^{6} \vec{e}_i \vec{e}_i F_i \Big) + \cdots = 0
\tag{4.16}
\]

If the mass equation is truncated to first-order terms in the derivatives, the
resulting equation contains only sums of $F_i$ and no tensors. The sums of
$F_i$ can be converted easily to $\rho$ and $\rho\vec{V}$, and this produces
the mass continuity equation. The momentum equation must be truncated to
second-order terms in the derivatives to produce the Navier Stokes equations.
This is necessary because second-order spatial derivatives contribute to the
viscosity of the fluid.
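As a short worked step, the first-order mass equation can be written out explicitly. Substituting the density and momentum relations of equation 4.10 into the leading terms of the mass equation 4.15 gives the continuity equation directly (the notation $\vec{e}_i$ for the lattice velocities is assumed here):

```latex
\begin{align*}
\frac{\partial}{\partial t}\Big(\sum_{i=0}^{6} F_i\Big)
  + \nabla\cdot\Big(\sum_{i=1}^{6} \vec{e}_i F_i\Big) &= 0 \\
\Longrightarrow\quad
\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\vec{V}) &= 0
\end{align*}
```

No approximation of the $F_i$ is needed at this order because only the sums $\sum_i F_i$ and $\sum_i \vec{e}_i F_i$ appear.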

A complication arises with the pressure tensor
$(\sum_i \vec{e}_i \vec{e}_i F_i)$, which appears in the momentum
equation 4.16. The pressure tensor cannot be expressed in terms of $\rho$ and
$\rho\vec{V}$ without introducing an approximation of the $F_i$ in terms of
$\rho$ and $\rho\vec{V}$. This approximation is also necessary in the mass
equation if we include high-order terms in the mass equation.

The Chapman-Enskog expansion approximates the populations $F_i(\vec{x},t)$
with the equilibrium populations $F_i^{eq}(\vec{x},t)$ to zero order. Then, a
correction is added to first order,

\[ F_i(\vec{x},t) = F_i^{eq}(\vec{x},t) + F_i^{(1)}(\vec{x},t) \tag{4.17} \]

and so on. The approximation of the $F_i$ can be viewed as another series
expansion that is used in parallel with the Taylor series expansion. To
retrieve the Navier Stokes equations, it is sufficient to calculate up to
first order, $F_i^{eq} + F_i^{(1)}$, while keeping up to second-order terms in
the Taylor series, as stated previously, in order to retrieve all the
viscosity terms.

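The correction term $F_i^{(1)}$ is computed from $F_i^{eq}$ using the
evolution equation 4.8 Taylor-expanded to first order with the $F_i$ replaced
by the zero-order estimate $F_i^{eq}$, as follows,

\[
F_i^{(1)} = -\tau\,\Delta t \left[ \frac{\partial F_i^{eq}}{\partial t} + \vec{e}_i \cdot \nabla F_i^{eq} \right]
\tag{4.18}
\]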


The accuracy of the $F_i$ approximation improves as $(V/c)$ becomes smaller.
The above correction can be used to replace $F_i$ with
$F_i^{eq} + F_i^{(1)}$ in the momentum equation 4.16. Further, we express the
$F_i^{eq}$ in terms of $\rho$ and $\rho\vec{V}$ in order to derive two partial
differential equations in terms of $\rho$ and $\rho\vec{V}$ corresponding to
momentum conservation. By choosing the formulas of the equilibrium populations
$F_i^{eq}$ appropriately, we can make the momentum equations match the Navier
Stokes equations. For example, the equilibrium populations of equation 4.13
produce the following x-momentum equation (to second-order terms),

\[
\begin{aligned}
\frac{\partial(\rho V_x)}{\partial t} + \frac{\partial(\rho V_x V_x)}{\partial x} + \frac{\partial(\rho V_x V_y)}{\partial y}
&= -\frac{\partial(3 c^2 w_0 \rho)}{\partial x} + \nu\,\nabla^2(\rho V_x) + \eta\,\frac{\partial(\nabla \cdot (\rho\vec{V}))}{\partial x} \\
\nu = \frac{c^2\,\Delta t}{8}\,(2\tau - 1), &\qquad \eta = 2 z_0 \nu
\end{aligned}
\tag{4.19}
\]


The above viscosity terms differ slightly from the form presented in
section 2.4, where the density appears outside the spatial derivatives, for
example,

\[
\nu\,\rho\,\nabla^2 V_x \quad \text{and} \quad \eta\,\rho\,\frac{\partial(\nabla \cdot \vec{V})}{\partial x}
\tag{4.20}
\]

This is not an issue in subsonic flow because the terms that make up the
difference (high-order derivatives of the density $\rho$) are very small
compared to the other terms.
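The relation between the relaxation parameter and the viscosities can be illustrated with a small numerical example. The sketch below assumes the d2q7 formulas $\tau = 1/2 + 4\,\Delta t\,\nu/\Delta x^2$, $\nu = (c^2 \Delta t/8)(2\tau - 1)$, and a bulk viscosity equal to $2 z_0 \nu$; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
def d2q7_tau(nu, dx, dt):
    """Relaxation parameter that yields kinematic viscosity nu on a d2q7 lattice."""
    return 0.5 + 4.0 * dt * nu / dx**2


def d2q7_viscosities(tau, dx, dt, z0=1.0 / 7.0):
    """Shear and bulk viscosity implied by tau (bulk assumed equal to 2 * z0 * shear)."""
    c = dx / dt
    shear = c**2 * dt / 8.0 * (2.0 * tau - 1.0)
    bulk = 2.0 * z0 * shear
    return shear, bulk


# Hypothetical grid spacing, time step, and target viscosity (SI units).
dx, dt, nu_target = 1.0e-4, 1.0e-7, 1.5e-5
tau = d2q7_tau(nu_target, dx, dt)          # slightly above 1/2 for small viscosity
shear, bulk = d2q7_viscosities(tau, dx, dt)
```

Small viscosities push $\tau$ close to $1/2$, so that $(2\tau - 1)$ is a small difference of nearly equal numbers; this is worth keeping in mind when choosing $\Delta x$ and $\Delta t$.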
4.1.3 Stability and accuracy

Formulas that describe the numerical error of the LB method can be obtained in
principle by continuing the Chapman-Enskog expansion outlined above. In
particular, most of the terms that differ from the Navier Stokes equations in
the expansion are multiplied by $\Delta x^2$ or $\Delta t^2$, which suggests
second-order accuracy. However, the terms from the next-order correction must
also be considered. Moreover, one must also investigate whether the truncated
Chapman-Enskog expansion is adequate to estimate the leading-order error term.
This is not obvious because the Chapman-Enskog expansion is not simply a
Taylor series expansion, but a “double” expansion that involves both a Taylor
series and the functional series expansion described above. A detailed
analysis has not been performed yet.

Leaving aside the theoretical difficulties, experimental evidence presented in
sections 4.4 and 4.5 shows that the LB method is second-order accurate in
space and in time. In the future, it would be very interesting to calculate
theoretically the constants of the leading-order error terms for the different
LB models (the d2q7 above, and the d2q9 and d2q15 described later), and to
test whether the theoretical error constants agree with the experimental
results.

Stability conditions for the LB method are not known in general. A few
necessary conditions are as follows. First, a CFL condition for explicit
methods requires that the ratio of the flow speed to the numerical speed,
$V/(\Delta x/\Delta t)$, be less than one. In addition, a subsonic flow
condition must be satisfied: the ratio of the flow speed to the speed of
sound, $V/c_s$, should be less than one. It should be noted that the CFL
condition applies generally to all explicit methods (see section 3.3.1), but
the subsonic flow condition is an additional requirement of the present
lattice Boltzmann approach.

Footnote: The CFL condition also requires that the ratio of the sound speed to
the numerical speed, $c_s/(\Delta x/\Delta t)$, be less than one. This is
always true in the case of lattice Boltzmann because
$c_s = \sqrt{3 w_0}\,(\Delta x/\Delta t)$ and because of the constraints on
the density coefficients $w_0, z_0$ (for example, see equation 4.22).




Another stability condition is that the density coefficients $w_0, z_0, y_0$
of the equilibrium population formulas must be positive. This fact can be
proven by considering the norm of the vector of populations $F_i$ and by
requiring that the norm does not grow after the relaxation (collision
operator) is applied. However, the algebra is rather complicated and is
omitted here. It is very easy to verify experimentally that non-positive
density coefficients $w_0, z_0, y_0$ lead to instabilities.

The requirement for positivity of the density coefficients $w_0, z_0, y_0$ can
be combined with other formulas to deduce further conditions. For example, in
the case of the 9-speed d2q9 model, the mass conservation formula is

\[ 4 w_0 + 4 y_0 + z_0 = 1 \tag{4.21} \]

Using $y_0 = w_0/4$ gives,

\[ w_0 < \frac{1}{5} \tag{4.22} \]

as an upper bound on the coefficient $w_0$. Actually, a more stringent bound
can be obtained by considering the formula for the bulk viscosity,

\[ \eta = 2\nu\,(1 - 3 w_0 - 6 y_0) \tag{4.23} \]

The second law of thermodynamics applied to the dissipation of energy during
the compression of fluid elements (Landau & Lifshitz [32, p.45]) requires that

\[ \eta \ge \frac{\nu}{3} \tag{4.24} \]

which gives

\[ w_0 + 2 y_0 \le \frac{5}{18} \tag{4.25} \]

or, using the choice $y_0 = w_0/4$,

\[ w_0 \le \frac{5}{27} \tag{4.26} \]

The above formulas are necessary conditions for the stability of the lattice
Boltzmann method.
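The necessary conditions listed in this section can be collected into a small checking routine. The following Python sketch is illustrative only: it assumes the d2q9 model with $y_0 = w_0/4$, the speed of sound $c_s = \sqrt{2 w_0 + 4 y_0}\,c$ of section 4.3, and the bulk-viscosity bound in the form $w_0 + 2 y_0 \le 5/18$ (which follows from requiring the bulk viscosity $2\nu(1 - 3w_0 - 6y_0)$ to be at least $\nu/3$). The function name is hypothetical. Passing all checks does not guarantee stability, since the conditions are necessary but not sufficient.

```python
def d2q9_necessary_conditions(w0, v_max, c=1.0):
    """Evaluate necessary (not sufficient) stability conditions for d2q9 with y0 = w0/4."""
    y0 = w0 / 4.0
    z0 = 1.0 - 4.0 * w0 - 4.0 * y0             # mass conservation: 4 w0 + 4 y0 + z0 = 1
    cs_squared = (2.0 * w0 + 4.0 * y0) * c**2  # speed of sound squared
    return {
        "cfl": v_max < c,                                      # flow speed below dx/dt
        "subsonic": cs_squared > 0.0 and v_max**2 < cs_squared,
        "positive_coefficients": w0 > 0.0 and y0 > 0.0 and z0 > 0.0,
        "bulk_viscosity": w0 + 2.0 * y0 <= 5.0 / 18.0,         # assumed form of the bound
    }


# Illustrative values: the common choice w0 = 1/9, and a value between 5/27 and 1/5.
standard = d2q9_necessary_conditions(w0=1.0 / 9.0, v_max=0.1)
too_large = d2q9_necessary_conditions(w0=0.19, v_max=0.1)
```

In this example the second parameter set keeps all density coefficients positive but violates the bulk-viscosity bound, which shows that the latter is the more restrictive condition on $w_0$.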

Footnote: The density coefficient $y_0$ is used in the orthogonal d2q9 model
of section 4.3. The density coefficient $y_0$ of the d2q9 model should
preferably be chosen as $y_0 = w_0/4$, following the ratio of the other
coefficients such as $y_1 = w_1/4$.

4.2 Initial and boundary conditions

Having reviewed the basic theory of the LB method, it is now appropriate to
discuss how to implement accurate initial and boundary conditions for the LB
method. The basic idea is to find a good way of calculating the populations
$F_i$ from the fluid variables $\rho, V_x, V_y$.

My approach is to combine the standard collision operator of the lattice
Boltzmann method with a new extended collision operator. This combination is
referred to as the hybrid method, and is described below. An alternative
approach is to truncate the Chapman-Enskog expansion. In theory, the infinite
series of the Chapman-Enskog expansion produces exactly the inverse mapping;
however, in practice the Chapman-Enskog expansion must be truncated.
Furthermore, the obvious truncation of the Chapman-Enskog expansion does not
perform very well (numerical tests of the zeroth-order and the first-order
truncated series are given in section 4.4). However, if the first-order
truncated series is modified appropriately, it produces an expression which is
identical to the hybrid method. This equivalence of the hybrid method and a
modified Chapman-Enskog expansion was first noticed by Dominique d’Humieres,
who kindly communicated this result to the author.

4.2.1  Previous  approaches  and  related  work 

Before  presenting  the  hybrid  method  and  the  extended  collision  operator,  it  is  useful  to 
review  how  initial  and  boundary  conditions  for  the  LB  method  have  been  traditionally 
implemented. 

Traditionally, the use of an accurate inverse mapping for the lattice
Boltzmann populations has been avoided both for initial value and for boundary
value problems. In the case of initial value problems, when the fluid density
and velocity $\rho, V_x, V_y$ are specified at time zero and the goal is to
calculate $\rho, V_x, V_y$ at later times, the populations $F_i$ can be
initialized equal to the equilibrium values $F_i^{eq}$, which are known in
terms of $\rho, V_x, V_y$. The error that results from this approximation can
be overcome by discarding the first few steps and measuring the parameters of
the flow afterwards (recalibrating the solution). This is often done in the
literature without further discussion. The problem with recalibration is that
a slightly different problem is solved than the one specified by the original
$\rho, V_x, V_y$. By contrast, traditional methods such as finite differences
do not need any recalibration. Thus, to put the lattice Boltzmann method on
equal footing with other methods (for numerical testing in particular), it is
desirable to have an accurate means of calculating the populations $F_i$ from
the initial values of $\rho, V_x, V_y$.

In the case of boundary conditions, there are techniques that avoid the
inverse mapping, as in the case of initial conditions. In particular, the
velocity of the fluid can be forced to zero at non-slip wall boundaries by
imposing a non-slip bounce-back of the populations $F_i$. However, the
location of the wall is not always well defined (see Cornubert et al. [12] and
Ginzbourg & Adler [21] for a discussion of the actual location of the wall as
a function of the simulation parameters for some simple flows). In the case of
boundary conditions with non-zero velocity, such as the driven cavity problem
(Peyret & Taylor [38, p.199]), the velocity at the boundary can be controlled
by inserting momentum (forcing) in every step, as is done in lattice gas
automata. This type of forcing is somewhat ad hoc, however: it is often
inaccurate and requires recalibration of the simulation parameters. In the
case of an arbitrary velocity specification at the boundary, such as the fluid
flows of section 4.5, the forcing techniques and the recalibration become very
difficult. Thus, it is desirable to have an accurate means of calculating the
populations $F_i$ at a boundary node from the fluid variables
$\rho, V_x, V_y$ that are specified at this node.

4.2.2  Hybrid  method  and  extended  collision  operator 

The calculation of the populations $F_i$ from the fluid variables
$\rho, V_x, V_y$ is now described. For this purpose, an extended collision
operator is introduced (denoted d2q7X), which differs from the standard
collision operator in the equilibrium population formulas. The evolution
equation remains as before, and can be written as follows,

\[
F_i(\vec{x} + \vec{e}_i \Delta t,\ t + \Delta t) = F_i(\vec{x},t) + \frac{F_i^{eq*}(\vec{x},t) - F_i(\vec{x},t)}{\tau^*}
\tag{4.27}
\]

The relaxation parameter of the extended collision operator is denoted
$\tau^*$ to distinguish it from the relaxation parameter $\tau$ of the
standard collision operator (and accordingly for other parameters shown
below).

The important idea is that the equilibrium population formulas $F_i^{eq*}$ of
the extended collision operator include additional terms (shown below) so that
the viscosity can be controlled independently from the relaxation parameter
$\tau^*$. Thus, $\tau^*$ can be set equal to one, which implies that the $F_i$
are replaced by the $F_i^{eq*}$ at each step. In other words, the old $F_i$
are not needed anymore, and the $F_i^{eq*}$ provide a direct mapping from the
flow variables $\rho, V_x, V_y$ to the new populations $F_i$.

The extended collision operator is used everywhere (at all the fluid nodes) at
startup, but only at the boundary nodes during the simulation. After the first
step, the standard collision operator is used at the inner (non-boundary)
nodes. This combination of the two operators is referred to as the hybrid
method here (denoted d2q7H in the case of the hexagonal model). It is valid to
combine two different collision operators as long as the two operators have
the same transport coefficients (shear and bulk viscosity), which is true
here.

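The equilibrium population formulas $F_i^{eq*}$ of the extended collision
operator include terms which are based on the gradients of the fluid velocity,
and are motivated by equation 2.5.1 of Wolfram [58]. The equilibrium
population formulas are as follows,

\[
\begin{aligned}
F_i^{eq*}(\vec{x},t) &= \rho(\vec{x},t) \left[ w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V}) \right] \\
&\quad + w_{31} \left( \vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V}) \right) + w_{32} \left( \nabla \cdot \rho\vec{V} \right), \quad i = 1, \ldots, 6 \\
F_0^{eq*}(\vec{x},t) &= \rho(\vec{x},t) \left[ z_0 + z_{21} (\vec{V} \cdot \vec{V}) \right] + z_{32} \left( \nabla \cdot \rho\vec{V} \right) \\
3 c^2 w_{31} + 6 w_{32} + z_{32} &= 0
\end{aligned}
\tag{4.28}
\]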


The velocity gradients in the above equation (the terms with coefficients
$w_{31}, w_{32}, z_{32}$) are computed using finite differences unless they
are known by other means; for example, some of the velocity gradients may be
known at the boundary nodes (see section 4.5). The coefficients
$w_0, w_1, w_{20}, w_{21}, z_0, z_{21}$ have the same values as in the
standard collision operator d2q7 (equation 4.13). It is worth noting that the
velocity gradient terms of equation 4.28 can be viewed as a correction to the
equilibrium population formulas,

\[ F_i^{eq*} = F_i^{eq} + F_i^{(1*)} \tag{4.29} \]

where

\[
F_i^{(1*)} = w_{31} \left( \vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V}) \right) + w_{32} \left( \nabla \cdot \rho\vec{V} \right)
\tag{4.30}
\]


The above formula is used in the next section to relate the extended collision
operator to a truncated Chapman-Enskog expansion.

Using the Chapman-Enskog expansion, the shear and bulk viscosities of the
extended collision operator can be calculated,

\[
\begin{aligned}
\nu^* &= \frac{c^2\,\Delta t}{8}\,(2\tau^* - 1) - \frac{3 c^4 w_{31}}{4} \\
\eta^* &= \frac{c^2\,\Delta t}{4}\,(2\tau^* - 1)\,z_0 - \frac{3 c^4 w_{31}}{2} - 3 c^2 w_{32}
\end{aligned}
\tag{4.31}
\]

When $\tau^*$ is set equal to one, the coefficient $w_{31}$ is chosen to
achieve the desired shear viscosity given the discretization parameters
$\Delta x$, $\Delta t$. The coefficient $w_{32}$ is chosen to achieve the
desired bulk viscosity, and the coefficient $z_{32}$ is chosen to enforce the
relation $(3 c^2 w_{31} + 6 w_{32} + z_{32}) = 0$, which corresponds to mass
conservation.

In the case of the hybrid method (when the standard and extended collision
operators are used in the same computation), the bulk viscosity of
equation 4.31 is chosen equal to the bulk viscosity of the standard collision
operator given by equation 4.19 (similarly for the shear viscosity). Also, the
relaxation parameter $\tau^*$ is set equal to 1.0. In this case, the
coefficients $w_{31}, w_{32}, z_{32}$ simplify as follows,

\[
\begin{aligned}
w_{31} &= \frac{(1 - \tau)\,\Delta t}{3 c^2} \\
w_{32} &= -w_0\,(1 - \tau)\,\Delta t \\
z_{32} &= -z_0\,(1 - \tau)\,\Delta t
\end{aligned}
\tag{4.32}
\]

It should be noted that the extended collision operator is accurate when used
for initial and boundary conditions, but it is not accurate when iterated many
times. It appears that the finite differences which are used by the extended
collision operator produce an error in viscosity, which means that the
computed solution decays at a slightly different rate than desired. The error
accumulates with successive iterations, and the method does not approximate
the solution as $\Delta t$ goes to zero (see figure 4-6 in section 4.4.2).
However, this is not a problem in practice because the extended collision
operator is only used at startup and subsequently only at the boundary nodes.

Finally, another issue worth mentioning is the initialization of the density
at startup. Quite often, the pressure $P(x,y)$ is specified at startup. Then,
the density $\rho(x,y)$ must be computed from the pressure,

\[ \rho(x,y) = \langle\rho\rangle + \left( \frac{1}{c_s^2} \right) P(x,y) \tag{4.33} \]

where $c_s$ is the speed of sound, $\langle\rho\rangle$ is the constant
average density, and $P$ is the pressure (with the constant average pressure
subtracted so that $\langle P \rangle = 0$). It is very important not to
initialize the density to be constant. The density must follow the initial
pressure gradients according to equation 4.33; otherwise large density waves
and error transients may result. Once the density and velocity
$\rho, V_x, V_y$ are specified correctly, the populations $F_i$ can be
calculated from $\rho, V_x, V_y$ using the extended collision operator
described above.
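The density initialization can be illustrated with a few lines of Python. The sketch assumes a pressure field sampled on equally weighted grid nodes, so that the average pressure can be removed with a plain arithmetic mean; the function name and the numerical values are hypothetical.

```python
import numpy as np


def density_from_pressure(pressure, rho_mean, cs):
    """Initial density rho = <rho> + P / cs^2, with P taken relative to its mean."""
    p = np.asarray(pressure, dtype=float)
    p = p - p.mean()          # enforce <P> = 0 (assumes equally weighted nodes)
    return rho_mean + p / cs**2


# Illustrative 2 x 2 pressure field in arbitrary consistent units.
pressure = np.array([[0.0, 2.0], [-2.0, 0.0]])
rho = density_from_pressure(pressure, rho_mean=1.0, cs=2.0)
```

The resulting density field has the prescribed mean and follows the pressure variations, which is the condition needed to avoid the initial density waves mentioned above.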
4.2.3 Truncated Chapman-Enskog expansion

An alternative way of deriving the hybrid method is to employ a truncated
Chapman-Enskog expansion and to perform additional manipulations. Below, the
zero-order and first-order truncated Chapman-Enskog expansions are described,
and then it is shown how to modify and simplify the first-order expansion in
order to obtain the hybrid method.

The zero-order expansion, denoted by d2q7E0, approximates the populations
$F_i$ with the equilibrium value $F_i^{eq}$. As stated earlier, this
approximation is used very often in the literature, and it is accompanied by
recalibration of the solution after the first few steps are discarded (initial
transients). The zero-order expansion is tested experimentally in section 4.4;
however, recalibration is not performed there because the goal is to compare
the accuracy of calculating the populations $F_i$ from the fluid variables
$\rho, V_x, V_y$.

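The first-order expansion, denoted by d2q7E1, approximates the populations
$F_i$ with the Chapman-Enskog expansion truncated to first order,

\[
\begin{aligned}
F_i &= F_i^{eq} + F_i^{(1)} \\
F_i^{(1)} &= -\tau\,\Delta t \left[ \frac{\partial F_i^{eq}}{\partial t} + \vec{e}_i \cdot \nabla F_i^{eq} \right]
\end{aligned}
\tag{4.34}
\]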


A differentiation of the equilibrium population formulas (equation 4.13)
provides formulas for the derivatives of $F_i^{eq}$ in terms of the
derivatives of the fluid variables $\rho, V_x, V_y$. The derivatives of
$\rho, V_x, V_y$ are known in some cases (for example, in exactly solvable
fluid flow problems), but in general the derivatives must be estimated using
finite differences. The initialization tests of section 4.4 employ finite
differences. In particular, the time derivatives of $\rho, V_x, V_y$ are
estimated using the Navier Stokes momentum and continuity equations, and the
spatial derivatives of $\rho, V_x, V_y$ are estimated using spatial finite
differences. I have also tested the different initialization methods using the
exact values of the derivatives, and the results are qualitatively the same as
those reported in section 4.4.

In section 4.4, it is shown that both d2q7E0 and d2q7E1 produce significant
errors in initialization. It is a little surprising that the first-order
Chapman-Enskog correction does not perform well, but there is an easy
explanation. We observe that the correction term $F_i^{(1)}$ of equation 4.34
does not conserve momentum. This means that the velocity field that results
from equation 4.34 is different from the original velocity field. The
conservation relations that correspond to equation 4.34 are as follows.


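\[
\begin{aligned}
\sum_{i=0}^{6} F_i^{(1)} &= -\tau\,\Delta t \left[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\vec{V}) \right] \\
\sum_{i=1}^{6} F_i^{(1)} e_{ix} &= -\tau\,\Delta t \left[ \frac{\partial(\rho V_x)}{\partial t} + \frac{\partial(\rho V_x V_x)}{\partial x} + \frac{\partial(\rho V_x V_y)}{\partial y} + \frac{\partial(3 c^2 w_0 \rho)}{\partial x} \right]
\end{aligned}
\tag{4.35}
\]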


and a similar equation for $\sum_i F_i^{(1)} e_{iy}$. Therefore, mass is
conserved via the macroscopic continuity equation, but momentum is not
conserved. On the other hand, the above equations suggest an easy way to fix
the problem: we simply add a viscosity Laplacian term so that momentum will be
conserved via the Navier Stokes momentum equation. The new (modified)
Chapman-Enskog correction term, denoted by $F_i^{(1m)}$, is as follows,

\[
F_i^{(1m)} = -\tau\,\Delta t \left[ \frac{\partial F_i^{eq}}{\partial t} + \vec{e}_i \cdot \nabla F_i^{eq} + \left( -\frac{\nu}{3 c^2} \right) \left( \vec{e}_i \cdot \nabla^2 (\rho\vec{V}) \right) \right]
\tag{4.36}
\]


In the numerical tests of section 4.4, the above equation is referred to as
d2q7E1M. The numerical tests show that d2q7E1M is very accurate for
initialization purposes. In practice, however, the d2q7E1M method is rather
cumbersome to apply because it requires the calculation of many derivatives,
including a time derivative and a Laplacian term.

Fortunately, equation 4.36 can be simplified greatly by neglecting
second-order terms in the Mach number. This means that only terms up to first
order in $(V/c)$ are kept in the Chapman-Enskog expansion, and terms
proportional to $(V/c)^2$ are discarded because they are small. In addition,
the time derivatives are replaced by space derivatives using the macroscopic
mass and momentum equations. Examples of this kind of expansion can be found
in Frisch [20] and d’Humieres [15]. Thus, equation 4.36 simplifies to,

\[
F_i^{(1s)} = -\tau\,\Delta t \left[ \frac{1}{3 c^2} \left( \vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V}) \right) - w_0 \left( \nabla \cdot \rho\vec{V} \right) \right]
\tag{4.37}
\]

and similarly for the rest particle population,

\[
F_0^{(1s)} = -\tau\,\Delta t \left[ -z_0 \left( \nabla \cdot \rho\vec{V} \right) \right]
\tag{4.38}
\]

In section 4.4, it is shown that the simplified equation 4.37 is as accurate
as the original equation 4.36 for initialization purposes.

Footnote: The addition of a viscosity Laplacian term to the first-order
Chapman-Enskog expansion (for the purpose of conserving momentum) does not
change the derivation of the Navier Stokes equations via the Chapman-Enskog
procedure because the corresponding corrections are higher-order derivatives
than the Navier Stokes equations.

The above formulas look suspiciously similar to the hybrid method that was
described in the last section. In fact, it is easy to verify that
equations 4.37 and 4.38 produce results identical to those of the hybrid
method. If equation 4.37 is used to initialize the populations $F_i$ as
$F_i = F_i^{eq} + F_i^{(1s)}$, and the first relaxation step is performed,
then the resulting populations which are advected (denoted $\tilde{F}_i$) are
as follows,

\[ \tilde{F}_i = \left[ F_i^{eq} + F_i^{(1s)} \right] + (-1/\tau)\,F_i^{(1s)} \tag{4.39} \]

\[
\tilde{F}_i = F_i^{eq} + (1 - \tau)\,\Delta t \left[ \frac{1}{3 c^2} \left( \vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V}) \right) - w_0 \left( \nabla \cdot \rho\vec{V} \right) \right]
\tag{4.40}
\]

The above populations are identical to the populations that are advected after
a relaxation step using the extended collision operator (equation 4.29) when
the simplified values of $w_{31}, w_{32}, z_{32}$ for the hybrid method are
used (equation 4.32). This shows that the simplified truncated first-order
Chapman-Enskog expansion is equivalent to the extended collision operator.
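The equivalence can also be checked numerically at a single node. The following Python sketch compares the two routes under stated assumptions: lattice units ($c = 1$, $\Delta t = 1$), $w_0 = z_0 = 1/7$, the simplified correction in the form $-\tau\Delta t\,[(1/(3c^2))(\vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V})) - w_0 (\nabla \cdot \rho\vec{V})]$, the hybrid coefficients $w_{31} = (1-\tau)\Delta t/(3c^2)$, $w_{32} = -w_0(1-\tau)\Delta t$, $z_{32} = -z_0(1-\tau)\Delta t$, and an arbitrary momentum-gradient matrix chosen for illustration. The function name is hypothetical.

```python
import numpy as np


def compare_initializations(rho, v, grad_m, tau, dt=1.0, c=1.0, w0=1.0 / 7.0):
    """Return the advected populations obtained by the two routes, as (rest, moving) pairs."""
    v = np.asarray(v, dtype=float)
    grad_m = np.asarray(grad_m, dtype=float)   # grad_m[a, b] = d(rho V_b)/dx_a
    z0 = 1.0 - 6.0 * w0
    angles = 2.0 * np.pi * np.arange(6) / 6.0
    e = c * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    ev, vv = e @ v, v @ v
    f_eq = rho * (w0 + ev / (3 * c**2) + 2 * ev**2 / (3 * c**4) - vv / (6 * c**2))
    f0_eq = rho * (z0 - vv / c**2)
    div_m = np.trace(grad_m)
    e_grad_e = np.einsum("ia,ab,ib->i", e, grad_m, e)

    # Route 1: initialize with F_eq + F^(1s), then apply one standard relaxation.
    f1s = -tau * dt * (e_grad_e / (3 * c**2) - w0 * div_m)
    f1s_0 = -tau * dt * (-z0 * div_m)
    relaxed = (f_eq + f1s) - f1s / tau
    relaxed_0 = (f0_eq + f1s_0) - f1s_0 / tau

    # Route 2: extended operator with tau* = 1, which advects F_eq* directly.
    w31 = (1.0 - tau) * dt / (3 * c**2)
    w32 = -w0 * (1.0 - tau) * dt
    z32 = -z0 * (1.0 - tau) * dt
    extended = f_eq + w31 * e_grad_e + w32 * div_m
    extended_0 = f0_eq + z32 * div_m
    return (relaxed_0, relaxed), (extended_0, extended)


(relaxed_0, relaxed), (extended_0, extended) = compare_initializations(
    rho=1.05, v=[0.02, -0.04], grad_m=[[0.1, 0.3], [-0.2, 0.05]], tau=0.65
)
```

Both routes give the same advected populations to rounding error, because $(1 - 1/\tau)(-\tau\,\Delta t) = (1 - \tau)\,\Delta t$.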

In the next section, LB models are described which are appropriate for
orthogonal grids in two and in three dimensions. The results of this section
are applied straightforwardly to the orthogonal LB models.

4.3 Lattice Boltzmann for orthogonal grids

The  ideas  discussed  in  the  previous  sections  using  the  hexagonal  7-speed  model  can 
be  applied  straightforwardly  to  other  lattice  Boltzmann  models.  Here,  the  orthogonal 
9-speed  model  in  two  dimensions  is  described. 


4.3.1  Two-dimensional  9-speed  model  (d2q9) 

The orthogonal 9-speed model is abbreviated by the symbol d2q9 following the
convention of Qian [41]. An orthogonal lattice (see figure 4-1) with nine
populations at each node is used. The population $F_0$ is non-moving, the
populations $F_i$, $i = 2, 4, 6, 8$, move along the diagonal directions at the
speed $\sqrt{2}\,c$, and the populations $F_i$, $i = 1, 3, 5, 7$, move along
the vertical and horizontal directions at the speed $c = \Delta x/\Delta t$.
The relaxation and advection steps are given by the following formulas.


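\[
\begin{aligned}
F_i(\vec{x} + \vec{e}_i \Delta t,\ t + \Delta t) &= F_i(\vec{x},t) + (-1/\tau)\left[ F_i(\vec{x},t) - F_i^{eq}(\vec{x},t) \right], \quad i = 1, \ldots, 8 \\
F_0(\vec{x},\ t + \Delta t) &= F_0(\vec{x},t) + (-1/\tau)\left[ F_0(\vec{x},t) - F_0^{eq}(\vec{x},t) \right] \\
\tau &= \frac{1}{2} + \frac{3\,\Delta t\,\nu}{\Delta x^2}
\end{aligned}
\tag{4.41}
\]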

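The relaxation parameter $\tau$ is chosen to achieve the desired kinematic
viscosity $\nu$ given the space and time discretization parameters $\Delta x$,
$\Delta t$. The vector $\vec{e}_i$ stands for the eight velocity directions of
the orthogonal (square) lattice,

\[
\vec{e}_i = \frac{\Delta x}{\Delta t} \left( \cos\frac{2\pi(i-1)}{8},\ \sin\frac{2\pi(i-1)}{8} \right)
\tag{4.42}
\]

Equation 4.42 gives the directions of the vectors; as stated above, the
diagonal vectors ($i$ even) have magnitude $\sqrt{2}\,c$ so that they connect
neighboring nodes of the square lattice.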


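The velocity $\vec{V}(\vec{x},t)$ and density $\rho(\vec{x},t)$ are computed
from the populations $F_i(\vec{x},t)$ using the relations,

\[
\begin{aligned}
\rho(\vec{x},t) &= \sum_{i=0}^{8} F_i(\vec{x},t) \\
\rho(\vec{x},t)\,\vec{V}(\vec{x},t) &= \sum_{i=1}^{8} F_i(\vec{x},t)\,\vec{e}_i
\end{aligned}
\tag{4.43}
\]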




The variations of density around its mean value (the spatial mean, which is
constant in time) provide an estimate of the fluid pressure $P(\vec{x},t)$,
according to the following equation,

\[ P(\vec{x},t) = c_s^2 \left( \rho(\vec{x},t) - \langle\rho\rangle \right) \tag{4.44} \]

The speed of sound is,

\[ c_s = \sqrt{2 w_0 + 4 y_0}\,(\Delta x / \Delta t) \tag{4.45} \]


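where the coefficients $w_0$, $y_0$ are discussed below. The equilibrium populations $F_i^{eq}(\vec{x}, t)$ are given by the following equations,

\[
\begin{aligned}
F_i^{eq} &= \rho\,[\,y_0 + y_1 (\vec{e}_i \cdot \vec{V}) + y_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + y_{21} (\vec{V} \cdot \vec{V})\,], \quad i = 2, 4, 6, 8 \\
F_i^{eq} &= \rho\,[\,w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V})\,], \quad i = 1, 3, 5, 7 \\
F_0^{eq} &= \rho\,[\,z_0 + z_{21} (\vec{V} \cdot \vec{V})\,]
\end{aligned}
\tag{4.46}
\]

\[
\begin{gathered}
4 w_0 + 4 y_0 + z_0 = 1, \\
y_1 = 1/(12 c^2), \quad y_{20} = 1/(8 c^4), \quad y_{21} = -1/(24 c^2), \\
w_1 = 1/(3 c^2), \quad w_{20} = 1/(2 c^4), \quad w_{21} = -1/(6 c^2), \\
z_{21} = -2/(3 c^2), \quad c = \Delta x / \Delta t
\end{gathered}
\]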

The coefficient $y_0$ is chosen $y_0 = (1/4)\,w_0$ for simplicity. The coefficient $w_0$ can be varied to adjust the speed of sound and the bulk viscosity within the stability constraints $w_0 > 0$ and $z_0 > 0$. The shear and bulk viscosity of the d2q9 collision operator have the following values (calculated using the Chapman-Enskog procedure),
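The equilibrium formulas of the d2q9 model can be checked numerically. The sketch below is a minimal illustration, not the thesis code: it builds the nine lattice velocities (assuming, as stated in the text, that the even-numbered diagonal vectors have magnitude $\sqrt{2}\,c$ and the odd-numbered axis vectors have magnitude $c$), evaluates the equilibrium populations with $y_0 = w_0/4$, and lets one confirm that their moments reproduce the density, the momentum, and the momentum flux $\rho c_s^2 \delta_{\alpha\beta} + \rho V_\alpha V_\beta$ with $c_s^2 = 3 w_0 c^2$. The function names and sample values are hypothetical.

```python
import numpy as np

def d2q9_velocities(c):
    """Rest vector (index 0), axis vectors (odd i), diagonal vectors (even i)."""
    e = np.zeros((9, 2))
    for i in range(1, 9):
        angle = 2.0 * np.pi * (i - 1) / 8.0
        speed = c if i % 2 == 1 else np.sqrt(2.0) * c
        e[i] = speed * np.array([np.cos(angle), np.sin(angle)])
    return e

def d2q9_equilibrium(rho, V, c, w0):
    """Equilibrium populations of the standard d2q9 operator, with y0 = w0/4."""
    y0 = w0 / 4.0
    z0 = 1.0 - 4.0 * w0 - 4.0 * y0          # from 4*w0 + 4*y0 + z0 = 1
    e = d2q9_velocities(c)
    VV = float(np.dot(V, V))
    F = np.empty(9)
    F[0] = rho * (z0 - 2.0 * VV / (3.0 * c**2))
    for i in range(1, 9):
        eV = float(np.dot(e[i], V))
        if i % 2 == 1:   # axis directions use the w coefficients
            F[i] = rho * (w0 + eV / (3.0 * c**2)
                          + eV**2 / (2.0 * c**4) - VV / (6.0 * c**2))
        else:            # diagonal directions use the y coefficients
            F[i] = rho * (y0 + eV / (12.0 * c**2)
                          + eV**2 / (8.0 * c**4) - VV / (24.0 * c**2))
    return F, e

rho, V, c, w0 = 1.2, np.array([0.05, -0.02]), 1.7, 1.0 / 7.0
F, e = d2q9_equilibrium(rho, V, c, w0)
print(F.sum(), F @ e)   # density and momentum recovered from the populations
```

The second moment of the populations is what fixes the speed of sound: changing $w_0$ changes only the isotropic part of the momentum flux.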


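\[
\begin{aligned}
\nu &= \frac{c^2\,\Delta t}{6}\,(2\tau - 1) \\
\mu &= \frac{c^2\,\Delta t}{3}\,(2\tau - 1)\,(1 - 3 w_0 - 6 y_0)
\end{aligned}
\tag{4.47}
\]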


The extended collision operator for the orthogonal 9-speed model (d2q9X) is derived similarly to the hexagonal model of section 4.2.2. Two additional terms based on gradients of the fluid velocity are included in the equilibrium population formulas. Everything else, including all the coefficients $w_1, y_1, w_{20}, \ldots$ of the standard collision operator d2q9, remains the same. The equilibrium population formulas for d2q9X are as follows,

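\[
\begin{aligned}
F_i^{eq} &= \rho\,[\,y_0 + y_1 (\vec{e}_i \cdot \vec{V}) + y_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + y_{21} (\vec{V} \cdot \vec{V})\,] \\
&\quad + y_{31}\,(\vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V})) + y_{32}\,(\nabla \cdot \rho\vec{V}), \quad i = 2, 4, 6, 8 \\
F_i^{eq} &= \rho\,[\,w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V})\,] \\
&\quad + w_{31}\,(\vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V})) + w_{32}\,(\nabla \cdot \rho\vec{V}), \quad i = 1, 3, 5, 7 \\
F_0^{eq} &= \rho\,[\,z_0 + z_{21} (\vec{V} \cdot \vec{V})\,] + z_{32}\,(\nabla \cdot \rho\vec{V})
\end{aligned}
\tag{4.48}
\]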

\[
2 c^2 w_{31} + 4 w_{32} + 4 c^2 y_{31} + 4 y_{32} + z_{32} = 0
\tag{4.49}
\]

\[
y_{31} = w_{31}/4
\tag{4.50}
\]

Equation 4.49 is necessary for mass conservation and can be used to determine the coefficient $z_{32}$. Equation 4.50 is necessary to remove an unwanted (anisotropic) momentum diffusion term in the Chapman-Enskog expansion. The velocity gradients of the extended collision operator must be computed using finite differences unless they are known by other means.

The shear and bulk viscosities of the d2q9X operator have the following values (calculated using the Chapman-Enskog procedure),

\[
\begin{aligned}
\nu &= \frac{c^2\,\Delta t}{6}\,(2\tau - 1) - c^4 w_{31} \\
\mu &= \frac{c^2\,\Delta t}{3}\,(2\tau - 1)\,(1 - 3 w_0 - 6 y_0) - 2 c^4 w_{31} - 2 c^2 (w_{32} + 2 y_{32})
\end{aligned}
\tag{4.51}
\]

The parameter $y_{32}$ is chosen $y_{32} = w_{32}/4$ for simplicity. Once the relaxation parameter $\tau$ is set equal to one, the coefficient $w_{31}$ is chosen to achieve the desired kinematic viscosity given the discretization parameters $\Delta x$, $\Delta t$. The coefficient $w_{32}$ is chosen to achieve the desired bulk viscosity. In the case of the hybrid method d2q9H, the bulk viscosity of equation 4.51 is chosen equal to the bulk viscosity of the standard collision operator given by equation 4.47.
4.3.2 Three-dimensional 15-speed model (d3q15)

The orthogonal 15-speed model is abbreviated by the symbol d3q15 following the convention of Qian [41]. A 3-dimensional cubic lattice with 15 populations at each node is used as shown in figure 4-3. The populations $F_i$, $i = 7, 8, 9, 10, 11, 12, 13, 14$ move along the diagonal directions at the speed $\sqrt{3}\,c$, and the populations $F_i$, $i = 1, 2, 3, 4, 5, 6$ move along the non-diagonal directions at the speed $c = \Delta x / \Delta t$. The non-moving population is $F_0$. The relaxation and advection steps are given by the following formulas,


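\[
\begin{aligned}
F_i(\vec{x} + \vec{e}_i \Delta t,\; t + \Delta t) &= F_i(\vec{x}, t) + (-1/\tau)\,[\,F_i(\vec{x}, t) - F_i^{eq}(\vec{x}, t)\,], \quad i = 1, \ldots, 14 \\
F_0(\vec{x},\; t + \Delta t) &= F_0(\vec{x}, t) + (-1/\tau)\,[\,F_0(\vec{x}, t) - F_0^{eq}(\vec{x}, t)\,] \\
\tau &= \frac{1}{2} + \frac{3\,\Delta t\,\nu}{\Delta x^2}
\end{aligned}
\tag{4.52}
\]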

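The relaxation parameter $\tau$ is chosen to achieve the desired kinematic viscosity $\nu$ given the space and time discretization parameters $\Delta x$, $\Delta t$. The vectors $\vec{e}_i$ stand for the 14 velocity directions of the 3-dimensional cubic lattice, as shown in figure 4-3.

The velocity $\vec{V}(\vec{x}, t)$ and density $\rho(\vec{x}, t)$ are computed from the populations $F_i(\vec{x}, t)$ using the relations,

\[
\begin{aligned}
\rho(\vec{x}, t) &= \sum_{i=0}^{14} F_i(\vec{x}, t) \\
\rho(\vec{x}, t)\,\vec{V}(\vec{x}, t) &= \sum_{i=1}^{14} F_i(\vec{x}, t)\,\vec{e}_i
\end{aligned}
\tag{4.53}
\]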

The variations of density around its mean value (spatial mean which is constant in time) provide an estimate of the fluid pressure $P(\vec{x}, t)$, according to the following equation,

\[
P(\vec{x}, t) = c_s^2 \left( \rho(\vec{x}, t) - \langle \rho \rangle \right).
\tag{4.54}
\]

The speed of sound is,

\[
c_s = \sqrt{2 w_0 + 8 y_0}\;(\Delta x / \Delta t)
\tag{4.55}
\]
Figure 4-3: Velocity directions for lattice Boltzmann d3q15 in three dimensions.

where the coefficients $w_0$, $y_0$ are discussed below. The equilibrium populations $F_i^{eq}(\vec{x}, t)$ are given by the following equations,

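\[
\begin{aligned}
F_i^{eq} &= \rho\,[\,y_0 + y_1 (\vec{e}_i \cdot \vec{V}) + y_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + y_{21} (\vec{V} \cdot \vec{V})\,], \quad i = 7, \ldots, 14 \\
F_i^{eq} &= \rho\,[\,w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V})\,], \quad i = 1, \ldots, 6 \\
F_0^{eq} &= \rho\,[\,z_0 + z_{21} (\vec{V} \cdot \vec{V})\,]
\end{aligned}
\tag{4.56}
\]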

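\[
\begin{gathered}
6 w_0 + 8 y_0 + z_0 = 1, \\
y_1 = 1/(24 c^2), \quad y_{20} = 1/(16 c^4), \quad y_{21} = -1/(48 c^2), \\
w_1 = 1/(3 c^2), \quad w_{20} = 1/(2 c^4), \quad w_{21} = -1/(6 c^2), \\
z_{21} = -1/(3 c^2), \quad c = \Delta x / \Delta t
\end{gathered}
\]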

The coefficient $y_0$ is chosen $y_0 = (1/8)\,w_0$ for simplicity. The coefficient $w_0$ can be varied to adjust the speed of sound and the bulk viscosity within the stability constraints $w_0 > 0$ and $z_0 > 0$. The shear and bulk viscosity of the d3q15 collision operator have the following values (calculated using the Chapman-Enskog procedure),

\[
\begin{aligned}
\nu &= \frac{c^2\,\Delta t}{6}\,(2\tau - 1) \\
\mu &= \frac{c^2\,\Delta t}{3}\,(2\tau - 1)\,(1 - 3 w_0 - 12 y_0)
\end{aligned}
\tag{4.57}
\]

The extended collision operator (d3q15X) for the orthogonal 15-speed model is derived similarly to the hexagonal model of section 4.1.1. Two additional terms based on gradients of the fluid velocity are included in the equilibrium population formulas. Everything else, including all the coefficients $w_1, y_1, w_{20}, \ldots$ of the standard collision operator d3q15, remains the same. The equilibrium population formulas for d3q15X are as follows,

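\[
\begin{aligned}
F_i^{eq} &= \rho\,[\,y_0 + y_1 (\vec{e}_i \cdot \vec{V}) + y_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + y_{21} (\vec{V} \cdot \vec{V})\,] \\
&\quad + y_{31}\,(\vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V})) + y_{32}\,(\nabla \cdot \rho\vec{V}), \quad i = 7, \ldots, 14 \\
F_i^{eq} &= \rho\,[\,w_0 + w_1 (\vec{e}_i \cdot \vec{V}) + w_{20} (\vec{e}_i \cdot \vec{V})(\vec{e}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V})\,] \\
&\quad + w_{31}\,(\vec{e}_i \cdot \nabla(\vec{e}_i \cdot \rho\vec{V})) + w_{32}\,(\nabla \cdot \rho\vec{V}), \quad i = 1, \ldots, 6 \\
F_0^{eq} &= \rho\,[\,z_0 + z_{21} (\vec{V} \cdot \vec{V})\,] + z_{32}\,(\nabla \cdot \rho\vec{V})
\end{aligned}
\tag{4.58}
\]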

\[
2 c^2 w_{31} + 6 w_{32} + 8 c^2 y_{31} + 8 y_{32} + z_{32} = 0
\tag{4.59}
\]

\[
y_{31} = w_{31}/8
\tag{4.60}
\]

Equation 4.59 is necessary for mass conservation and can be used to determine the coefficient $z_{32}$. Equation 4.60 is necessary to remove an unwanted (anisotropic) momentum diffusion term in the Chapman-Enskog expansion.

The shear and bulk viscosity of the d3q15X operator have the following values (calculated using the Chapman-Enskog procedure),

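\[
\begin{aligned}
\nu &= \frac{c^2\,\Delta t}{6}\,(2\tau - 1) - c^4 w_{31} \\
\mu &= \frac{c^2\,\Delta t}{3}\,(2\tau - 1)\,(1 - 3 w_0 - 12 y_0) - 2 c^4 w_{31} - c^2 (2 w_{32} + 8 y_{32})
\end{aligned}
\tag{4.61}
\]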

The coefficient $y_{32}$ is chosen $y_{32} = w_{32}/8$ for simplicity. Once the relaxation parameter $\tau$ is set equal to one, the coefficient $w_{31}$ is chosen to achieve the desired kinematic viscosity $\nu$ given the discretization parameters $\Delta x$, $\Delta t$. The coefficient $w_{32}$ is chosen to achieve the desired bulk viscosity. In the case of the hybrid method d3q15H, the bulk viscosity of equation 4.61 is chosen equal to the bulk viscosity of the standard collision operator given by equation 4.57.

Figure 4-4: The velocity field of the hexagonal Taylor vortex and the hexagonal shear flow are shown in figures (a) and (b) respectively. Both flows have periodic boundary conditions.

The following two sections present experimental evidence regarding the accuracy of the hexagonal d2q7 and the orthogonal d2q9 models in initial and in boundary value problems. Experimental results for the three-dimensional d3q15 model are not presented here. However, the algorithm presented above (both d3q15 and d3q15X) has been tested on simple flows, and appears to work correctly. The accuracy of the d3q15 model is expected to be comparable to the accuracy of the d2q9 model.


4.4  Experiments  —  initial  value 
First, initial value problems are tested. For this purpose, the analytic solutions of a decaying Taylor vortex and a decaying shear flow are used. These flows are two-dimensional and have periodic boundary conditions. Figure 4-4 shows the velocity vector fields of the flows. The decaying Taylor vortex (G.I. Taylor 1923 [51]) has the following analytic solution,

\[
\begin{aligned}
V_x(x, y, t) &= (-1/A)\,\cos(Ax)\,\sin(By)\,\exp(-2 a \nu t) \\
V_y(x, y, t) &= (1/B)\,\sin(Ax)\,\cos(By)\,\exp(-2 a \nu t) \\
P(x, y, t) &= -(1/4)\,[\,\cos(2Ax)/A^2 + \cos(2By)/B^2\,]\,\exp(-4 a \nu t)
\end{aligned}
\tag{4.62}
\]

where the constant $a$ is equal to $(A^2 + B^2)/2$, and $\nu$ is the kinematic viscosity. The length constants $A$, $B$ are chosen $A = 1$ and $B = 2/\sqrt{3}$ to produce the hexagonal Taylor vortex, and $A = B = 1$ to produce the orthogonal Taylor vortex. The former is used to test the hexagonal 7-speed model, and the latter is used to test the orthogonal 9-speed model. The flow region of the hexagonal Taylor vortex is $0 \le x \le 2\pi$ and $0 \le y \le \pi\sqrt{3}$, and can be covered exactly by a hexagonal lattice using periodic boundary conditions. Similarly, the flow region of the orthogonal Taylor vortex is $0 \le x \le 2\pi$ and $0 \le y \le 2\pi$, and can be covered exactly by an orthogonal lattice using periodic boundary conditions.
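The analytic Taylor vortex solution is simple to evaluate directly. The following Python sketch (the function name and default arguments are illustrative assumptions) returns the velocity and pressure fields for given length constants $A$, $B$ and kinematic viscosity $\nu$. For the hexagonal vortex with $\nu = 1$, the decay factor at $t = 1$ is $\exp(-7/3) \approx 0.097$, which is the "approximately 1/10" reduction of the maximum velocity used in the experiments below.

```python
import numpy as np

def taylor_vortex(x, y, t, A=1.0, B=1.0, nu=1.0):
    """Decaying Taylor vortex: returns (Vx, Vy, P) at position (x, y), time t."""
    a = (A**2 + B**2) / 2.0
    decay = np.exp(-2.0 * a * nu * t)
    Vx = (-1.0 / A) * np.cos(A * x) * np.sin(B * y) * decay
    Vy = (1.0 / B) * np.sin(A * x) * np.cos(B * y) * decay
    P = -0.25 * (np.cos(2.0 * A * x) / A**2
                 + np.cos(2.0 * B * y) / B**2) * decay**2
    return Vx, Vy, P

# Hexagonal Taylor vortex: A = 1, B = 2/sqrt(3), region 2*pi by pi*sqrt(3).
A_hex, B_hex = 1.0, 2.0 / np.sqrt(3.0)
vx0, _, _ = taylor_vortex(0.3, 0.7, 0.0, A_hex, B_hex)
vx1, _, _ = taylor_vortex(0.3, 0.7, 1.0, A_hex, B_hex)
print(vx1 / vx0)   # decay factor of the velocity between t = 0 and t = 1
```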

The  decaying  shear  flow  has  the  following  analytic  solution, 

\[
\begin{aligned}
V_x(x, y, t) &= A \\
V_y(x, y, t) &= B\,\cos(k x - k A t)\,\exp(-k^2 \nu t) \\
P(x, y, t) &= \text{constant}
\end{aligned}
\tag{4.63}
\]

where the constant $k$ is chosen $k = 1$ so that $x$ varies between $0 \le x \le 2\pi$, and the length constants $A$, $B$ are chosen $A = B = 1$ so that the horizontal velocity is equal to the maximum vertical velocity. The vertical extent of the shear flow is chosen $0 \le y \le \pi\sqrt{3}$ for the hexagonal case, and $0 \le y \le 2\pi$ for the orthogonal case in complete analogy with the Taylor vortex.
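The decaying shear flow can be evaluated in the same way. This short Python sketch (function name and defaults are illustrative assumptions) returns the two velocity components, with the vertical velocity advected at the horizontal speed $A$ and damped at the rate $k^2 \nu$.

```python
import numpy as np

def shear_flow(x, t, A=1.0, B=1.0, k=1.0, nu=1.0):
    """Decaying shear flow: returns (Vx, Vy); the pressure is constant."""
    x = np.asarray(x, dtype=float)
    Vx = A * np.ones_like(x)
    Vy = B * np.cos(k * x - k * A * t) * np.exp(-k**2 * nu * t)
    return Vx, Vy

x = np.linspace(0.0, 2.0 * np.pi, 30, endpoint=False)   # assumed 30-point grid
Vx, Vy = shear_flow(x, t=1.0)
print(np.abs(Vy).max())   # bounded by exp(-1) for A = B = k = nu = 1
```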

In all of the results reported below, the coefficient of shear viscosity is chosen equal to one, $\nu = 1$. The measured error denotes the velocity relative error, and is calculated according to the following formula,


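\[
V_E = \frac{\sum_{x,y} |V_x - V_x^*|}{\sum_{x,y} |V_x^*|}
    + \frac{\sum_{x,y} |V_y - V_y^*|}{\sum_{x,y} |V_y^*|}
\tag{4.64}
\]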


where $\vec{V}^*$ denotes the exact analytic solution, and the sums are taken over the whole grid. In the case of the Hagen-Poiseuille flow and the oscillating plate problem (see section 4.5) where $\sum_{x,y} |V_y^*| = 0$, we use a different normalization as follows,


\[
V_E = \frac{\sum_{x,y} |V_x - V_x^*| + \sum_{x,y} |V_y - V_y^*|}{\sum_{x,y} |V_x^*|}
\tag{4.65}
\]


Double-precision arithmetic is used in all of the reported results unless stated otherwise (for example in figure 4-13).

The Mach number $M$ is defined using the maximum fluid speed at time zero, which is equal to 1.0 for all the test cases,

\[
M = 1/c_s = \Delta t / (\Delta x \sqrt{3 w_0})
\tag{4.66}
\]

Also, the pseudo-Mach number or "computational Mach number" $M_c$ is defined,

\[
M_c = 1/c = \Delta t / \Delta x
\tag{4.67}
\]


Below, $M_c$ is used in the figures rather than $M$ because the discretization error of the lattice Boltzmann method depends on $M_c$ rather than $M$, as we will see below. In the case of the Taylor vortex, which is a solution of the incompressible Navier-Stokes equations, the compressible effects are kept smaller than the discretization error by choosing $w_0 = 1/7$. Both the compressible effects and the discretization error decrease quadratically with $M_c$, and the choice $w_0 = 1/7$ keeps the compressible effects smaller than the discretization error in the Taylor vortex at least (see section 4.4.4). In the case of shear flow, which has zero density gradient and is a solution of the compressible Navier-Stokes equations, the error is independent of the Mach number $M$ and it depends only on $M_c$.

For the hexagonal 7-speed model, the choice $w_0 = 1/7$ produces a Mach number that satisfies the relation $M = 1.53\,M_c = 1.53\,\Delta t/\Delta x$. For the orthogonal 9-speed model, the choice $y_0 = w_0/4$ and $w_0 = 1/7$ produces $M = 1.53\,M_c$ also. Another
choice  Wq  =  10“®/3  is  discussed  briefly  in  section  4.4.3  for  the  purpose  of  allowing 
high  Mach  numbers  with  small  Me  in  particular  M  =  (10^  Me).  We  also  note  that 
different values of $w_0$ are used in section 4.4.4 for the purpose of examining the error of the lattice Boltzmann method as a function of $\Delta t$ while keeping the Mach number constant. In particular, the Mach number is kept constant by varying $w_0$ in proportion to $\Delta t^2$ (see equation 4.68). This study allows us to distinguish between compressible effects and the discretization error of the lattice Boltzmann method.
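The relations between the Mach number, the computational Mach number, and the density coefficient can be collected in a few lines of Python. The sketch below uses $M = \Delta t/(\Delta x\sqrt{3 w_0})$ and $M_c = \Delta t/\Delta x$ as defined above, and the constant-Mach choice $w_0 = (1/3)\,(\Delta t/(\Delta x\,M))^2$ of equation 4.68. The function names and sample values are hypothetical.

```python
import math

def mach_number(dt, dx, w0):
    """Mach number M = dt / (dx * sqrt(3*w0)) for unit maximum fluid speed."""
    return dt / (dx * math.sqrt(3.0 * w0))

def computational_mach_number(dt, dx):
    """Computational Mach number Mc = dt / dx."""
    return dt / dx

def w0_for_constant_mach(dt, dx, M):
    """Density coefficient that keeps the Mach number fixed as dt varies."""
    return (dt / (dx * M)) ** 2 / 3.0

dx, dt = 2.0 * math.pi / 30.0, 0.001   # assumed discretization
ratio = mach_number(dt, dx, 1.0 / 7.0) / computational_mach_number(dt, dx)
print(ratio)   # M / Mc for w0 = 1/7, i.e. sqrt(7/3)
print(w0_for_constant_mach(dt, dx, M=0.02))
```

Since $w_0$ scales with $\Delta t^2$ at fixed $M$, halving the time step reduces the required $w_0$ by a factor of four.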

4.4.1  Initialization  error 

This section compares the different methods of initialization which are described in section 4.2, and are denoted by d2q7F0, d2q7F1, d2q7F1M, and d2q7H. We recall that the simplified first-order Chapman-Enskog expansion (equations 4.37, 4.38) is identical to the hybrid method d2q7H, and thus there is no need to test it separately. Figure 4-5 plots the error during the first 10 steps of the simulation. A 30 x 30 grid is used ($\Delta x = 2\pi/30 = 0.2094$). Figure (a) plots the error in the case of the hexagonal Taylor vortex, using $\Delta t = 0.001$ which gives $\tau = 0.5912$ for the standard collision operator. The curves shown correspond to d2q7F0, d2q7F1, d2q7F1M, d2q7H (solid, dashed, dotted, dash-dotted lines). Figure (b) plots the same data using $\Delta t = 0.025$ which gives $\tau = 2.780$ for the standard collision operator. We can see that the first-order momentum-conserving Chapman-Enskog expansion d2q7F1M and the hybrid method d2q7H produce very similar results, and they are the most accurate in all cases. We can also see that the first-order Chapman-Enskog expansion d2q7F1 that does not conserve momentum is more accurate than the zero-order expansion d2q7F0 when $\tau < 1$, and less accurate when $\tau > 1$. Figures (c) and (d) plot the same data as figures (a) and (b) for the case of shear flow. The results are qualitatively the same. The experiments demonstrate that the hybrid method can be used to initialize accurately the populations $F_i$ from the fluid variables $\rho$, $V_x$, $V_y$ in an initial value problem.

Figure 4-5: The four initialization methods d2q7F0, d2q7F1, d2q7F1M, d2q7H (solid, dashed, dotted, dash-dotted lines) are compared using a 30 x 30 grid and periodic boundary conditions. Figures (a) and (b) plot the error in simulating the hexagonal Taylor vortex using $\Delta t = 0.001$ and $\Delta t = 0.025$ respectively ($\tau = 0.5912$ and $\tau = 2.780$). Figures (c) and (d) plot the same data in the case of shear flow.

Figure 4-6: The performance of the extended collision operator is shown during repeated iterations. The error is plotted against $M_c$ with $\Delta t$ varying, and is calculated at the final time $T = 1.0$. The curves correspond to the hybrid method d2q7H, to the extended collision operator d2q7X using finite differences to calculate the gradients, and again to the extended collision operator d2q7X using the known analytic solution to calculate the gradients (solid, dashed, dotted lines). Figure (a) shows the error in simulating the hexagonal Taylor vortex, and figure (b) shows the error in simulating the hexagonal shear flow.
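The two relaxation parameters quoted for the 30 x 30 grid can be reproduced with a short calculation. The sketch below assumes the hexagonal 7-speed relation $\tau = 1/2 + 4\,\Delta t\,\nu/\Delta x^2$; this relation is not restated in this section, and is used here because it matches the quoted values for $\nu = 1$ and $\Delta x = 2\pi/30$. The function name is hypothetical.

```python
import math

def tau_d2q7(nu, dx, dt):
    """Assumed relaxation parameter of the hexagonal model: 1/2 + 4*dt*nu/dx**2."""
    return 0.5 + 4.0 * dt * nu / dx**2

dx = 2.0 * math.pi / 30.0
nu = 1.0
tau_small = tau_d2q7(nu, dx, 0.001)   # time step of figures (a) and (c)
tau_large = tau_d2q7(nu, dx, 0.025)   # time step of figures (b) and (d)
print(round(tau_small, 4), round(tau_large, 3))
```

The two time steps therefore sample one case with $\tau < 1$ and one with $\tau > 1$, which is what separates the behavior of the zero-order and first-order expansions.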


4.4.2  Iterating  the  extended  collision  operator 

This section examines the performance of the extended collision operator when iterated many times. We recall that the extended collision operator uses the gradients of the fluid velocity to control the viscosity. Figure 4-6 shows the error in simulating the hexagonal Taylor vortex and the hexagonal shear flow using a 30 x 30 grid. The error is plotted against $M_c$ with $\Delta t$ varying, and is calculated at the final time $T = 1.0$ when the maximum velocity of the hexagonal Taylor vortex is approximately 1/10 of its initial value. The curves correspond to the hybrid method d2q7H, and to the extended collision operator d2q7X using finite differences to calculate the gradients, and again to the extended collision operator d2q7X using the exact solution to calculate the gradients (solid, dashed, dotted lines). At the point $M_c = 0.026$ where the curves of figure 4-6 intersect, the relaxation parameter $\tau$ of the standard collision operator is equal to one, and the coefficients $w_{31}$, $w_{32}$, $z_{32}$ of the extended collision operator vanish (see equation 4.32). At this point, the extended collision operator is identical to the standard collision operator.
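The location of the intersection can be checked by hand. The worked step below assumes the hexagonal relation $\tau = 1/2 + 4\,\nu\,\Delta t/\Delta x^2$, which is not restated in this section but is consistent with the relaxation parameters quoted in section 4.4.1 for $\Delta x = 2\pi/30$ and $\nu = 1$.

```latex
\tau = 1
\;\Longrightarrow\;
\frac{4\,\nu\,\Delta t}{\Delta x^2} = \frac{1}{2}
\;\Longrightarrow\;
\Delta t = \frac{\Delta x^2}{8\,\nu},
\qquad
M_c = \frac{\Delta t}{\Delta x} = \frac{\Delta x}{8\,\nu}
    = \frac{2\pi/30}{8} \approx 0.0262
\quad (\nu = 1)
```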

As $M_c$ decreases below the value $M_c = 0.026$, the error of the extended collision operator d2q7X using finite differences to calculate the gradients begins to grow and approaches relative error one as $M_c$ goes to zero (dashed line). By contrast, the error of the extended collision operator d2q7X using the analytic solution to calculate the gradients decreases towards a minimum error (dotted line) which is determined by the spatial discretization error of the 30 x 30 grid. This shows that the use of finite differences creates problems after repeated iterations. As explained in section 4.2, the inexactness of finite differences produces an error in viscosity which accumulates and becomes large after repeated iterations.

The hybrid method d2q7H does not suffer from the problems of the extended collision operator after repeated iterations because the hybrid method uses the standard collision operator at the inner nodes after the first step (all nodes are inner in this experiment). Figure 4-6 shows that the hybrid method performs well in the case of periodic boundary conditions, and remains accurate as $M_c$ goes to zero (solid line). In section 4.5, it is shown that the hybrid method performs well in the case of boundary value problems also.

4.4.3  Comparison  with  projection  method 

This section compares the error of the hybrid method d2q7H and the error of an explicit finite difference projection method in simulating the hexagonal Taylor vortex and the hexagonal shear flow with periodic boundary conditions. Both of these flows are defined in the hexagonal region $0 \le x \le 2\pi$ and $0 \le y \le \pi\sqrt{3}$, which means that the finite difference projection method must use the discretization $\Delta y = \Delta x \sqrt{3}/2$. Below, we refer to the projection method with the symbol EP7 when it is applied to a hexagonal region, and with the symbol EP9 when it is applied to an orthogonal region (this is done in later sections). The explicit finite difference projection method is described in section 3.4.

Figure 4-7: The error of the lattice Boltzmann method d2q7H is compared against the error of the explicit finite difference projection method EP7. The curves correspond to d2q7H using 30 x 30 grid, d2q7H using 60 x 60 grid, EP7 using 30 x 30 grid, and EP7 using 60 x 60 grid (solid, dashed, dotted, dash-dotted lines). Figures (a) and (b) show the error in simulating the hexagonal Taylor vortex, and figures (c) and (d) show the error in simulating the hexagonal shear flow.

Figure 4-7 (a) plots the error in simulating the hexagonal Taylor vortex against $M_c$ with $\Delta t$ varying. The error is calculated at the final time $T = 1.0$ when the maximum velocity of the hexagonal Taylor vortex is approximately 1/10 of its initial value. The curves correspond to d2q7H using 30 x 30 grid, d2q7H using 60 x 60 grid, EP7 using 30 x 30 grid, and EP7 using 60 x 60 grid (solid, dashed, dotted, dash-dotted lines). Figure (b) plots the same data against the dimensionless ratio $\Delta t\,\nu/\Delta x^2$, which facilitates comparison between different grids. Figures (c) and (d) plot the same data for shear flow. We can see that the Taylor vortex triggers an instability in the explicit projection method EP7 when $\Delta t\,\nu/\Delta x^2 \ge 0.2$, but the shear flow does not trigger any instability.

With regard to the lattice Boltzmann method, we observe that it fails to approximate the solution (has a relative error of 1.0) when $M_c$ is larger than 0.2 approximately. In the case of the Taylor vortex, which is a solution of the incompressible fluid flow equations, it may appear that the problem arises from the compressibility of the lattice Boltzmann fluid (when $M_c \approx 0.2$, the Mach number is approximately $M = 1.53\,M_c = 0.3$). In the case of the shear flow, however, compressibility is not important. The shear flow is a solution of the compressible fluid flow equations, and it should be easily computed by the lattice Boltzmann method both at low and high Mach numbers. In fact, the shear flow can be computed easily at high Mach numbers
by  using  a  smaller  Wq^  for  example  Wq  =  10“®/3  (see  below). 

The limitations of the lattice Boltzmann method shown in figure 4-7 when $M_c$ is larger than 0.2 persist independent of the Mach number. The limitations arise because the microscopic speed $\Delta x/\Delta t$ becomes comparable to the fluid speed when $M_c$ approaches 1.0, and the high-order terms in the Chapman-Enskog expansion (which are neglected in deriving the Navier-Stokes equations) become significant, and produce behavior that differs from the Navier-Stokes equations.

With  regard  to  simulating  shear  flow  at  high  Mach  numbers,  we  can  choose  Wq  = 
10“®/3  which  gives  M  =  10^  Me-  The  error  of  the  lattice  Boltzmann  method  d2q7H 
in  simulating  shear  flow  with  Af  =  10^  Me  is  identical  to  the  error  plotted  in  figure  4- 
7(c).  The  error  in  simulating  shear  flow  is  independent  of  the  Mach  number  because 
the  density  gradients  are  zero  everywhere. 

4.4.4  Quadratic  convergence 

This section shows that the lattice Boltzmann method has second-order convergence both in space and in time. Second-order convergence in space means that the error decreases quadratically with $\Delta x$ while keeping the dimensionless ratio $\Delta t\,\nu/\Delta x^2$ constant (Fletcher [18, p.75]). Second-order convergence in time means that the error decreases quadratically with $\Delta t$ while keeping the space discretization $\Delta x$ constant. Furthermore, we are interested in the true discretization error and not the error that arises from compressibility. When using a compressible fluid code such as the lattice Boltzmann method to simulate incompressible flow such as the Taylor vortex, it is important to distinguish between the error that arises from compressibility and the error that arises from finite discretization.

In figure 4-7 the Mach number decreases in proportion to $M_c$, and thus the effects of compressibility and finite discretization cannot be distinguished without further analysis. To distinguish between the effects of compressibility and discretization error, we perform the same simulations as those in figure 4-7, while keeping the Mach number constant and varying the density coefficient $w_0$ as follows,

\[
w_0 = \frac{1}{3} \left( \frac{\Delta t}{\Delta x\,M} \right)^2
\tag{4.68}
\]
Figure 4-8: The error of d2q7H is plotted against $M_c$ with $\Delta t$ varying, while keeping the Mach number $M$ constant and varying the density parameter $w_0$ (two dashed lines). For comparison purposes, the error of d2q7H when the Mach number varies and the density parameter $w_0 = 1/7$ is held constant is also shown (two solid lines). Results are shown for a 30 x 30 and a 60 x 60 grid. Figures (a), (b), (c) correspond to the hexagonal Taylor vortex at $M = 0.02$, the hexagonal Taylor vortex at $M = 0.1$, and the hexagonal shear flow at $M = 0.05$ respectively.
In figure 4-8 (a), we show the error of d2q7H in simulating the hexagonal Taylor vortex at constant Mach number $M = 0.02$ using a 30 x 30 grid and a 60 x 60 grid (two dashed lines). For comparison purposes, we also show the error of d2q7H using constant $w_0 = 1/7$ and variable Mach number $M = 1.53\,M_c$ (two solid lines). The constant Mach number curves are identical to the constant $w_0$ curves except for instabilities which are discussed below. This indicates that the compressible effects at Mach number $M = 0.02$ are smaller than the discretization error of both the 30 x 30 and 60 x 60 grids. The instability of the constant Mach number curves (dashed lines) is expected and it occurs when the density coefficient $w_0$ given by equation 4.68 becomes greater than 1/6, which forces the density coefficient $z_0$ to become negative. Similar instabilities can be seen in figure 4-8 (c) which plots the same experiment for shear flow at constant Mach number $M = 0.05$.
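The onset of this instability can be estimated from equation 4.68 written in terms of $M_c = \Delta t/\Delta x$. The worked step below assumes the hexagonal normalization $6 w_0 + z_0 = 1$, which is implied by the statement that $w_0 > 1/6$ makes $z_0$ negative.

```latex
w_0 = \frac{1}{3}\left(\frac{M_c}{M}\right)^2 > \frac{1}{6}
\;\Longleftrightarrow\;
M_c > \frac{M}{\sqrt{2}},
\qquad
M = 0.02:\; M_c > 0.0141,
\qquad
M = 0.05:\; M_c > 0.0354
```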

It is important to note that if we keep the Mach number constant while decreasing the grid spacing ($\Delta x$), then a sufficiently fine grid will eventually bring out the compressible effects. For example, figure 4-8 (b) shows the same data as figure 4-8 (a) while keeping the Mach number constant at $M = 0.1$. In the case of the 30 x 30 grid the constant Mach number curves are identical to the constant $w_0$ curves as before, which indicates that the discretization error of the 30 x 30 grid is larger than the compressible effects of Mach number $M = 0.1$. In the case of the 60 x 60 grid, however, the constant Mach number curves reach a minimum error (as $\Delta t$ goes to zero) that is much greater than the minimum error of the constant $w_0$ curves. This is because the discretization error of the 60 x 60 grid becomes smaller than the compressible effects of Mach number $M = 0.1$ when $\Delta t\,\nu/\Delta x^2$ becomes smaller than 0.1 approximately.

In general, we can calculate the Mach number at which compressible effects become larger than the discretization error of any grid by doing more numerical experiments of the kind shown in figure 4-8. Such a study is not necessary for our purposes, however. Figures 4-8(a) and 4-8(b) are enough to show that the compressible effects in simulating the Taylor vortex are smaller than the discretization error of the 30 x 30 and 60 x 60 grids when $w_0$ is constant and the Mach number varies as $M = 1.53\,M_c$. Accordingly, we can examine the error curves of figure 4-8 and also of figure 4-7 to find out how the discretization error of the lattice Boltzmann method decreases with finer resolution.

If we examine the logarithmic plots of figure 4-7, we see that the error decreases quadratically with $\Delta t$ (the curves have a slope of 2 on the logarithmic axes) until a minimum spatial discretization error is reached. In addition, the error decreases by a factor of 4 when we go from the 30 x 30 grid to the 60 x 60 grid while keeping the dimensionless ratio $\Delta t\,\nu/\Delta x^2$ constant, see figures 4-7 (b) and 4-7 (d). In other words, the lattice Boltzmann method has second-order convergence both in space and in time. In section 4.5 we will verify the second-order convergence for boundary value problems also. The explicit finite difference projection method EP7 has first-order convergence in time and second-order convergence in space. The first-order convergence in time of the projection method EP7 can be seen most easily in figures 4-7 (c) and 4-7 (d).
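The order of convergence can be estimated from the errors measured on two grids. The Python sketch below computes the observed order $p = \log(E_{coarse}/E_{fine}) / \log(r)$ for a refinement ratio $r$. The function name and the two error values are hypothetical placeholders chosen to illustrate a factor-of-4 reduction; they are not measurements from figure 4-7.

```python
import math

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Observed convergence order from errors on a coarse and a refined grid."""
    return math.log(err_coarse / err_fine) / math.log(refinement)

# Hypothetical errors on a 30 x 30 and a 60 x 60 grid (factor-of-4 reduction).
err_30, err_60 = 4.0e-3, 1.0e-3
print(observed_order(err_30, err_60))
```

A factor-of-4 reduction under a halving of $\Delta x$ corresponds to second order, whereas a factor-of-2 reduction under a halving of $\Delta t$ would indicate first order, as for the projection method in time.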

4.4.5  7-speed  versus  9-speed 

Here, the accuracy of the hexagonal 7-speed model is compared against the accuracy of the orthogonal 9-speed model. Figure 4-9 shows the error of d2q7H applied to the hexagonal Taylor vortex, and the error of d2q9H applied to the orthogonal Taylor vortex (solid and dashed lines). In addition, the error of the explicit finite difference projection method is shown when the projection method is applied to the hexagonal Taylor vortex with $\Delta y = \Delta x \sqrt{3}/2$, and also to the orthogonal Taylor vortex with $\Delta y = \Delta x$ (dotted and dash-dotted lines). A 30 x 30 grid is used, and the error is calculated at the final time $T = 1.0$. We can see that the explicit finite difference projection method performs similarly on the hexagonal and the orthogonal Taylor vortices. By contrast, the orthogonal 9-speed model d2q9H is significantly more accurate than the hexagonal 7-speed model d2q7H on this specific problem.
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Figure  4-9:  The  error  of  d2q7H  applied  to  the  hexagonal  Taylor  vortex,  and  the  error 
of  d2q9H  applied  to  the  orthogonal  Taylor  vortex  are  shown  (solid  and  dashed  lines). 
In  addition  the  error  of  the  explicit  hnite  difference  projection  method  is  shown  when 
applied  to  the  hexagonal  Taylor  vortex  with  Ay  =  Ax\/3/2  and  also  the  orthogonal 
Taylor  vortex  with  Ay  =  Ax  (dotted  and  dash-dotted  lines). 

4.5  Experiments  —  boundary  value 

In this section, the orthogonal 9-speed hybrid model d2q9H is tested on boundary
value problems with exact solutions, and is also compared against the explicit finite
difference projection method FP9. In all of the test cases examined here, both the
density and the velocity values are specified exactly at the boundary. The question
of how to compute the density at a boundary (such as a non-slip wall) using the
computed solution is discussed later in section 4.6.1.

The boundary value problems are the one-quarter Taylor vortex, the Hagen-Poiseuille
flow, and the oscillating plate above a stationary wall. Figure 4-10 shows
the velocity vector fields of these flows, and also indicates the boundary nodes of each
flow by drawing a square around the boundary nodes. Figure 4-10 (c) is plotted at
time t = 0.4 when the oscillating plate starts moving to the left while the fluid below
is still moving to the right.
Figure 4-10: The velocity field of the one-quarter Taylor vortex, the Hagen-Poiseuille
flow, and the oscillating plate problem are shown in figures (a), (b), (c) respectively.
Boundary nodes are marked with a square. Figure (c) is plotted at time t = 0.4 when
the oscillating plate starts moving to the left and the fluid below is still moving to
the right.
The one-quarter Taylor vortex is defined in the region $\pi/2 \le x \le 3\pi/2$ and
$\pi/2 \le y \le 3\pi/2$. The exact solution is given by equation 4.62 with A = B = 1.
The velocity and pressure are specified at the boundary by evaluating the exact
solution at the horizontal and vertical lines $\pi/2 \le x \le 3\pi/2$ and $\pi/2 \le y \le 3\pi/2$.
From the pressure, we calculate the density using equation 4.33.

The Hagen-Poiseuille flow is defined in the region $0 \le x \le 1$ and $0 \le y \le 1$.
The analytic solution is as follows,

$$
\begin{aligned}
V_x(x,y,t) &= -(y^2 - y)\,\Delta P / (2\mu) \\
V_y(x,y,t) &= 0 \\
P(x,y,t) &= (0.5 - x)\,\Delta P
\end{aligned}
\qquad (4.69)
$$

The pressure gradient $\Delta P$ is chosen $\Delta P = 8.0\,\mu$ so that the maximum fluid speed
is 1.0 when y = 1/2. The velocity and the density are specified at the boundary by
evaluating the exact solution at $0 \le x \le 1$ and $0 \le y \le 1$.

The oscillating plate problem is defined in the region $0 \le x \le 1$ and
$0 \le y \le 1$ with periodic boundary conditions in the horizontal direction at $x = 0$ and
$x = 1$. The velocity is specified at the top and bottom plates by evaluating the exact
solution, namely,

$$
\begin{aligned}
y = 1:&\quad V_x = \cos(\omega t), \quad V_y = 0 \\
y = 0:&\quad V_x = 0, \quad V_y = 0
\end{aligned}
\qquad (4.70)
$$

The density at the top and bottom plates is set equal to 1.0 (the exact solution has
constant pressure everywhere). The frequency of oscillation $\omega$ is chosen $\omega = 20$ so
that the oscillating plate executes 3.18 cycles of oscillation during the time interval
T = 1.0 which is used for testing (this is an arbitrary choice). The analytic solution
of the oscillating plate problem (see section 2.5.3) is given by the following equations,

$$
\begin{aligned}
V_x(x,y,t) &= \big( \cosh A \sin A \,(-2 \cosh B \sin B \cos \omega t + 2 \cos B \sinh B \sin \omega t) \\
&\quad - \cos A \sinh A \,(2 \cosh B \sin B \sin \omega t + 2 \cos B \sinh B \cos \omega t) \big) \\
&\quad / \,(\cos 2B - \cosh 2B) \\
V_y(x,y,t) &= 0 \\
P(x,y,t) &= \text{constant}
\end{aligned}
\qquad (4.71)
$$

where $A = y \sqrt{\omega/(2\nu)}$ and $B = \sqrt{\omega/(2\nu)}$, and $\nu$ is the kinematic viscosity.
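
As a sanity check on equation 4.71, the following sketch evaluates the analytic velocity and confirms numerically that it matches the plate conditions of equation 4.70 and the one-dimensional diffusion equation $\partial V_x/\partial t = \nu\,\partial^2 V_x/\partial y^2$. The function names and the value $\nu = 0.1$ are assumptions made for illustration only; the viscosity used in the actual tests is not stated in this passage.

```python
import math

def plate_vx(y, t, omega=20.0, nu=0.1):
    """Analytic V_x(y, t) of equation 4.71 (moving plate at y = 1, wall at y = 0)."""
    B = math.sqrt(omega / (2.0 * nu))
    A = y * B
    ct, st = math.cos(omega * t), math.sin(omega * t)
    num = (math.cosh(A) * math.sin(A)
           * (-2.0 * math.cosh(B) * math.sin(B) * ct
              + 2.0 * math.cos(B) * math.sinh(B) * st)
           - math.cos(A) * math.sinh(A)
           * (2.0 * math.cosh(B) * math.sin(B) * st
              + 2.0 * math.cos(B) * math.sinh(B) * ct))
    return num / (math.cos(2.0 * B) - math.cosh(2.0 * B))

def diffusion_residual(y, t, omega=20.0, nu=0.1, h=1e-4, k=1e-3):
    """Central-difference estimate of dV/dt - nu * d2V/dy2 (should be near zero)."""
    dvdt = (plate_vx(y, t + h, omega, nu) - plate_vx(y, t - h, omega, nu)) / (2.0 * h)
    d2v = (plate_vx(y + k, t, omega, nu) - 2.0 * plate_vx(y, t, omega, nu)
           + plate_vx(y - k, t, omega, nu)) / (k * k)
    return dvdt - nu * d2v

top = plate_vx(1.0, 0.4)      # should equal cos(omega * t) at the moving plate
bottom = plate_vx(0.0, 0.4)   # should vanish at the stationary wall
```

At t = 0.4 the plate velocity $\cos(8)$ is negative, which agrees with the remark that the plate has started moving to the left in figure 4-10 (c).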

In the case of steady flow such as the Hagen-Poiseuille flow, we initialize the
variables $\rho, V_x, V_y$ equal to the exact steady state solution. Then, we iterate for 100
steps, and test whether the fluid is in steady state. If the fluid is in steady state, we
measure the velocity relative error. Otherwise, we keep iterating until the fluid
reaches steady state. The goal of this procedure is to measure the error at steady
state and not to characterize how quickly the fluid reaches steady state. The criterion
for steady state is that the relative change in velocity between successive iterations
divided by $\Delta t$ must be less than $10^{-8}$,

$$
\frac{\sum_{x,y} |V_x(t + \Delta t) - V_x(t)|}{\sum_{x,y} |V_x(t)|} < 10^{-8}\,\Delta t \qquad (4.72)
$$

and similarly for $V_y$.
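
A minimal sketch of such a steady-state test is given below. It is not the code used for the experiments: the function name is hypothetical, the normalization by the sum of $|V|$ over all nodes is an assumption about how the relative change is formed, and the default tolerance is only an illustrative value that should be set to the threshold of equation 4.72.

```python
def is_steady(v_new, v_old, dt, tol=1e-8):
    """Return True when sum|v_new - v_old| / sum|v_old| < tol * dt.

    v_new and v_old are flat sequences holding one velocity component
    at all nodes for two successive iterations."""
    change = sum(abs(a - b) for a, b in zip(v_new, v_old))
    scale = sum(abs(b) for b in v_old)
    if scale == 0.0:
        # Fluid at rest: fall back to the absolute change (an assumption).
        return change < tol * dt
    return change / scale < tol * dt
```

The test would be applied to each velocity component in turn, for example every 100 steps as described above.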

In the case of transient flow such as the one-quarter Taylor vortex and the oscillating
plate, the error is measured at the final time T = 1.0 using equations 4.64
and 4.65.


4.5.1  Comparison  between  LB  boundary  schemes 

The hybrid method d2q9H uses the standard collision operator at the inner nodes,
and the extended collision operator at the boundary nodes. An important issue is
the calculation of the gradients of the fluid velocity at the boundary nodes. The best
results are achieved when the gradients of the fluid velocity are specified using the
exact solution. In practice, however, the velocity gradients at the boundary nodes are
usually not known. For example, the gradient $\partial V_x / \partial y$ at the top and bottom walls of
the driven cavity problem (not reported here but see Peyret & Taylor [38, p. 199])
cannot be specified because it is part of the solution that we seek to compute. When a
velocity gradient cannot be specified, finite differences must be used to estimate it.

In the experiments below, different ways of specifying the velocity gradients at
the boundary are tested. First, the exact solution is used to specify all of the velocity
gradients at the boundary nodes. Second, finite differences are used to estimate
all the velocity gradients at the boundary nodes. When the exact solution is used,
the method is denoted by d2q9H_XD (XD stands for exact derivatives at the boundary).
When first-order asymmetric differences are used, the method is denoted by
d2q9H_1FD. When second-order asymmetric differences are used, the method is denoted
by d2q9H_2FD.
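
For concreteness, the two kinds of asymmetric (one-sided) differences can be sketched as follows. The function names are hypothetical, and the stencils shown are the standard one-sided formulas on a uniform grid; the exact stencils used by the methods above are assumed rather than quoted.

```python
def one_sided_first_order(f0, f1, h):
    """First-order estimate of df/dy at the boundary node (value f0)."""
    return (f1 - f0) / h

def one_sided_second_order(f0, f1, f2, h):
    """Second-order estimate of df/dy at the boundary node (value f0)."""
    return (-3.0 * f0 + 4.0 * f1 - f2) / (2.0 * h)

# Illustration with f(y) = y**2 + y, whose exact derivative at y = 0 is 1.
h = 0.1
f = [y * y + y for y in (0.0, h, 2.0 * h)]
d_first = one_sided_first_order(f[0], f[1], h)          # off by h
d_second = one_sided_second_order(f[0], f[1], f[2], h)  # exact for a quadratic
```

Here f0 is the value at the boundary node and f1, f2 are the values at the next two nodes in the direction of the interior.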

In the experiments below, we also test the lattice Boltzmann scheme d2q9F0
which uses the standard collision operator at every node, both boundary and inner
nodes. At the boundary nodes, the method d2q9F0 sets the populations $F_i$ equal to
the equilibrium values of the standard collision operator given by equation 4.13.
At startup, the method d2q9F0 normally initializes the $F_i$ equal to the equilibrium
values of the standard collision operator. In the present section, however, the
extended collision operator is used for initialization in order to avoid initial errors,
and the standard collision operator is used after the first step.

Regarding boundary conditions for the explicit finite difference projection method,
the velocity at the boundary is specified from the exact solution, and the pressure
P is specified from the requirement $\partial P / \partial n = 0$ at the boundary, where $n$ denotes
the direction normal to the boundary (Peyret & Taylor [38, p. 160]). The condition
$\partial P / \partial n = 0$ is applied at the beginning of the SOR calculation using the values of P
at the previous time step, and the resulting boundary values for the pressure P are
held constant throughout the SOR calculation.
Figure 4-11: The error of d2q9H_XD, d2q9H_1FD, d2q9H_2FD, d2q9F0 (solid, dashed,
dotted, and dash-dotted lines) is shown in simulations of the one-quarter Taylor
vortex, the Hagen-Poiseuille flow, and the oscillating plate, in figures (a), (b), (c)
respectively.
Figure 4-11 compares the methods d2q9H_XD, d2q9H_1FD, d2q9H_2FD, and d2q9F0
(solid, dashed, dotted, and dash-dotted lines) in simulations of the one-quarter Taylor
vortex, the Hagen-Poiseuille flow, and the oscillating plate, figures (a), (b), (c)
respectively. A 30 × 30 grid is used, and the error is plotted against $M_c$ with $\Delta t$ varying,
and is calculated at the final time T = 1.0. We can see that the standard collision
operator d2q9F0 achieves smallest error when the relaxation parameter $\tau = 1$, at
which point the standard and extended collision operators are identical. We can also
see that the hybrid method achieves best results when the velocity gradients at the
boundary nodes are specified from the exact solution (method d2q9H_XD). Further, we
can see that the finite differences at the boundary (d2q9H_1FD and d2q9H_2FD) trigger
instabilities when $M_c$ becomes large, and that first-order differences are a little more
stable than second-order differences. However, second-order differences are recommended
because they are more accurate with regard to the error in pressure (which
is not shown here, but see page 225). As explained on page 140, all the numerical
tests of this chapter examine the error in velocity only.

4.5.2  Comparison  with  incompressible  finite  differences 

Figure 4-12 compares the error of the lattice Boltzmann method d2q9H_XD against
the error of the incompressible finite difference projection method FP9 in simulations
of the one-quarter Taylor vortex, the Hagen-Poiseuille flow, and the oscillating plate,
figures (a), (b), (c) respectively. The error is plotted against the dimensionless ratio
$\Delta t\,\nu / \Delta x^2$ to facilitate comparison between different grids. The curves correspond to
d2q9H_XD using 30 × 30 grid, d2q9H_XD using 60 × 60 grid, FP9 using 30 × 30 grid,
and FP9 using 60 × 60 grid (solid, dashed, dotted, dash-dotted lines). Figure (b)
shows most clearly the rate of convergence in time. The lattice Boltzmann method
has second-order convergence in time (slope -2), and the finite difference method
FP9 has first-order convergence in time (slope -1). Both methods have second-order
convergence in space.
Figure 4-12: The error of the lattice Boltzmann method d2q9H_XD is compared against
the error of the incompressible finite difference method FP9. The curves correspond
to d2q9H_XD using 30 × 30 grid, d2q9H_XD using 60 × 60 grid, FP9 using 30 × 30 grid,
and FP9 using 60 × 60 grid (solid, dashed, dotted, dash-dotted lines). Figures (a), (b),
(c) show simulations of the one-quarter Taylor vortex, the Hagen-Poiseuille flow, and
the oscillating plate respectively. Figure (d) shows the same experiment as figure (a)
using d2q9H_1FD instead of d2q9H_XD.
It is worth noting that the lattice Boltzmann method has second-order convergence
overall even when first-order differences are used to calculate the velocity gradients
at the boundary nodes. This can be seen in figure 4-12 (d) which corresponds to
the same experiment as figure 4-12 (a) but uses the method d2q9H_1FD instead of the
method d2q9H_XD.

4.6  More  on  boundary  conditions 

4.6.1  Density  calculation  at  non-slip  wall 

The modeling of a non-slip wall using the lattice Boltzmann method is discussed here.
Suitable boundary conditions must ensure that the velocity components $V_x, V_y$ vanish
at a non-slip wall, and further that there is a way of calculating the density at a
non-slip wall. There are basically two approaches to imposing boundary conditions
at a non-slip wall using the lattice Boltzmann method. The first approach is the
traditional bounce-back of the populations, which was discussed in section 4.2.1. The second
approach, which is used in the simulations of flue pipes, employs the extended collision
operator of section 4.2.

The traditional approach is to bounce back the populations $F_i$ which are moving
outwards, so as to produce incoming populations. As stated earlier in section 4.2.1,
the approach of bounce-back leads to a non-slip wall which is located somewhere
beyond the last set of nodes of the grid, usually a distance of $\Delta x/2$ away. However,
the exact location of the wall is not known, and may vary with the flow conditions
near the boundary (Cornubert et al. [12], Ginzbourg & Adler [21]). Regarding the
density, the calculation of density at the wall is not an issue because there are no
fluid nodes located on the non-slip wall. The density calculation at the nodes nearest
the wall is performed in the same way as for all the interior nodes.

The second approach of modeling a non-slip wall, which is used in the simulations
of flue pipes, employs the extended collision operator of section 4.2 together with
bounce-back as follows. First, a bounce-back of the outgoing populations is performed
in order to produce incoming populations which are used to calculate the density at
the wall. ® Then, the extended collision operator is applied at the nodes of the
wall using $V_x = V_y = 0$. The gradients of $V_x, V_y$ which are needed by the extended
collision operator are calculated using finite differences. The benefit of using the
extended collision operator at the boundary nodes is that the non-slip wall is located
precisely at the boundary nodes within numerical error.
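
The first step of this procedure (bounce-back followed by a density sum) can be sketched as follows. This is only an illustration under stated assumptions: the ordering of the nine lattice directions, the helper names, and the use of a plain sum of the populations for the density are choices made for the sketch. The subsequent application of the extended collision operator with zero wall velocity is not shown.

```python
# D2Q9 lattice directions; this ordering is an assumption of the sketch.
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
OPPOSITE = [0, 3, 4, 1, 2, 7, 8, 5, 6]   # index of the reversed direction

def wall_density(F, normal):
    """Bounce back the outgoing populations at a wall node and sum for the density.

    F: the nine populations at the wall node.
    normal: (nx, ny) pointing from the wall into the fluid."""
    G = list(F)
    for i, (cx, cy) in enumerate(C):
        if cx * normal[0] + cy * normal[1] > 0:   # direction i points into the fluid
            G[i] = F[OPPOSITE[i]]                 # reversed outgoing population
    return sum(G), G

# Bottom wall (fluid above): directions 2, 5, 6 are filled from 4, 7, 8.
rho_wall, G_wall = wall_density([1, 2, 3, 4, 5, 6, 7, 8, 9], (0, 1))
```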

4.6.2  Composite  grid  for  lattice  Boltzmann 

This section outlines how to implement composite grids for the lattice Boltzmann
method using the extended collision operator.

The extended collision operator can be used to join a lattice Boltzmann grid with
a finite difference grid of the same resolution. For this purpose, a single layer of
overlapping nodes must be used. At the overlapping nodes, the future values of $\rho, V_x, V_y$
are calculated using the finite difference method. Subsequently, the future values of
$\rho, V_x, V_y$ (already calculated by finite differences) are used to initialize populations
$F_i$ at the overlapping nodes, which are used as boundary conditions for the lattice
Boltzmann method on the other side of the grid.

The scheme for a composite grid is as follows. Let us assume that lattice Boltzmann
is used on a coarse grid at the left side. Going from left to right, there is a
point where we change from lattice Boltzmann to finite differences. Further on, the
resolution of the finite difference grid is changed to a finer resolution. For simplicity,
let us assume that the resolution on the right side is twice the resolution on the
left side. Traditional interpolation can be used to join the two finite difference grids
of different resolution. Further on, as we move to the right, we change from finite
differences to lattice Boltzmann at the fine resolution.

®In my earlier paper [48], I suggested that the density at a wall should be calculated as the
average of the populations that "bring fluid into the boundary node" from inner nodes and other
neighboring boundary nodes. Further numerical experiments, however, indicate that the average of
the populations after bounce-back, which is recommended above, is a slightly better approach.

An issue to remember is that the speed of sound in the lattice Boltzmann method
is proportional to $\sqrt{w_0}\,\Delta x/\Delta t$ where $w_0$ is a density parameter. Therefore, if the
spacing $\Delta x$ is halved, the density parameter must be multiplied by 4, or the time step
$\Delta t$ must be halved also. In the former case, the same time step is used globally,
and is determined by the finest grid or the smallest spacing $\Delta x$. In the latter case,
some computation is saved in the coarse grid. In particular, twice as many steps are
performed at the finer resolution grid than at the coarser grid. This must be taken
into account in the transition region where finite differences and interpolation are
used. Presumably, the coarse-grid values can be held constant every other "fine" step
of the fine grid.
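
The ordering of updates in the second case (time step halved on the fine grid) can be sketched as a simple schedule. The function below is hypothetical and only lists the order of the steps; it assumes that the coarse-grid interface values are held constant during the fine steps that fall within one coarse step, as suggested above, and it does not perform any fluid computation.

```python
def subcycle_schedule(n_coarse_steps, ratio=2):
    """List the order of updates when the fine grid takes `ratio` steps
    for every step of the coarse grid."""
    events = []
    for n in range(n_coarse_steps):
        events.append(("coarse", n))
        for k in range(ratio):
            # Coarse values at the interface are held constant during
            # these fine steps (an assumption of the sketch).
            events.append(("fine", n * ratio + k))
    return events

schedule = subcycle_schedule(2)
```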

The transition between grids of different resolution inevitably introduces some
error. The desired goal is that the transition error (interpolation error, etc.) should
not be larger than the error difference between the fine and the coarse grid. This
must be tested especially with regard to the propagation of acoustic waves.

Finally, we might wonder why we should switch back and forth between lattice
Boltzmann and finite differences rather than stay with finite differences all the time.
The answer is that lattice Boltzmann may provide better stability properties, better
handling of boundary conditions, and better modeling of acoustic waves. These issues
need to be investigated further in the future.

4.7  Appendix 

4.7.1  Roundoff  error  of  lattice  Boltzmann 

In this section, the numerical roundoff error of the lattice Boltzmann method is
discussed using the 7-speed hexagonal LB model for simplicity. It is shown that
the roundoff error in the equilibrium population formulas can cause problems under
certain conditions. In particular, it is shown that the roundoff error increases as
the ratio $V/c$ becomes smaller (namely, as the Mach number becomes smaller), or
as the ratio $\Delta x/\Delta t$ becomes larger. The increasing roundoff error is undesirable
because large values of $\Delta x/\Delta t$ are useful for improving the accuracy (reducing the
discretization error) of the lattice Boltzmann method. Fortunately, double-precision
arithmetic mitigates the roundoff error to a large extent.

Let us consider the implementation of the lattice Boltzmann method according to
the equations 4.8, 4.10, 4.13. The roundoff error (numerical loss of precision) arises in
the computation of the equilibrium populations using equation 4.13. This formula is
a sum of four terms. If we factor out the density $\rho(\vec{x}, t)$, the first term is a constant
coefficient $w_0$ and the remaining terms are proportional to $V/c$, $(V/c)^2$, and $(V/c)^2$
respectively (see table 4.1). Consequently when $V/c$ is small, for example $V/c \approx 10^{-3}$,
the terms to be added have very disparate sizes and their sum suffers a significant
loss of accuracy when the computer aligns the numbers to be added (about 5 or 6
decimal places when $V/c \approx 10^{-3}$). If single-precision arithmetic is used (about eight
decimal places), then the loss of five digits is a serious problem.
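
The loss of digits can be reproduced with a few lines of code. The sketch below is an illustration only: the helper names are hypothetical, single precision is emulated by rounding through the `struct` module, and the weight and the size of the small term are assumed values that merely mimic a term of relative size $(V/c)^2$ next to $w_0$.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def small_term_after_sum(w0, small, single=True):
    """Add a small term to w0, subtract w0 again, and return what survives."""
    if single:
        total = f32(f32(w0) + f32(small))
        return f32(total - f32(w0))
    return (w0 + small) - w0

w0 = 1.0 / 12.0            # illustrative weight (an assumption)
small = w0 * 1.234567e-6   # term of relative size (V/c)^2 with V/c ~ 1e-3
rel_err_single = abs(small_term_after_sum(w0, small, True) - small) / small
rel_err_double = abs(small_term_after_sum(w0, small, False) - small) / small
```

In single precision only the leading digits of the small term survive the addition, while in double precision the relative error of the recovered term remains many orders of magnitude smaller.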


term    w_0    w_1    w_20       w_21
size    1      V/c    (V/c)^2    (V/c)^2

Table 4.1: The terms of the equilibrium population formula have different sizes. When
they are added together, numerical roundoff error can be significant.

Below, numerical experiments are described based on single-precision computer
arithmetic, which indicate that the error of the lattice Boltzmann method decreases
at first as the speed $\Delta x/\Delta t$ increases, but after some point the error starts to increase
with larger $\Delta x/\Delta t$. For example, in the Taylor vortex when the maximum fluid speed
is 1.0, the error starts to increase as a power of $\Delta x/\Delta t$ when $\Delta x/\Delta t$ is larger
than 300. Fortunately, the error growth disappears when double-precision arithmetic
is used, and this confirms that the breakdown of the method is caused by roundoff
error.
An approximate estimate of the extent of roundoff problems is that increasing the
ratio $\Delta x/\Delta t$ by a factor of 10 increases the roundoff error in the equilibrium populations
by a decimal digit. Therefore, double-precision arithmetic provides roundoff-free
operation with ratios $\Delta x/\Delta t$ which are $10^8$ times larger than the corresponding
ratios in single-precision arithmetic. Clearly, this is a very wide margin for practical
calculations.

Apart from using double-precision arithmetic, there is an algebraic transformation
which reduces the roundoff error in the equilibrium populations, and it can
be used in all cases because it does not involve any additional cost. The algebraic
transformation does not eliminate the roundoff error however, and double-precision
arithmetic remains necessary. The idea is to modify the populations $F_i$ defined by
equations 4.8, 4.10, 4.13 as follows.


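$$
\begin{aligned}
\tilde{F}_i &= F_i - w_0 \langle\rho\rangle \\
\tilde{F}_i^{eq} &= F_i^{eq} - w_0 \langle\rho\rangle
\end{aligned}
\qquad (4.73)
$$

where the spatial average density $\langle\rho\rangle$ is constant in time and typically equal to
one. The non-moving population becomes $\tilde{F}_0 = F_0 - z_0 \langle\rho\rangle$. The conservation
relations are modified accordingly,

$$
\begin{aligned}
\rho(\vec{x},t) &= \sum_{i=0}^{6} \tilde{F}_i(\vec{x},t) + \langle\rho\rangle \\
\rho(\vec{x},t)\,\vec{V}(\vec{x},t) &= \sum_{i=1}^{6} \tilde{F}_i(\vec{x},t)\,\vec{C}_i
\end{aligned}
\qquad (4.74)
$$

The new equilibrium population formulas are as follows,

$$
\begin{aligned}
\tilde{F}_i^{eq}(\vec{x},t) &= w_0\,(\rho(\vec{x},t) - \langle\rho\rangle) + \rho(\vec{x},t)\,\big[\,w_1 (\vec{C}_i \cdot \vec{V}) + w_{20} (\vec{C}_i \cdot \vec{V})(\vec{C}_i \cdot \vec{V}) + w_{21} (\vec{V} \cdot \vec{V})\,\big] \\
\tilde{F}_0^{eq}(\vec{x},t) &= z_0\,(\rho(\vec{x},t) - \langle\rho\rangle) + \rho(\vec{x},t)\, z_{21} (\vec{V} \cdot \vec{V})
\end{aligned}
\qquad (4.75)
$$
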
The new equilibrium population formulas are numerically better than the original
ones because the term that used to be $w_0\,\rho$ is now $w_0\,(\rho - \langle\rho\rangle)$. The new quantity
$(\rho - \langle\rho\rangle)$ is of the order $P/(3c^2 w_0)$ and the pressure $P$ is of the order $\rho V^2$, as can
be seen from the Navier-Stokes equations. Hence the expression $w_0\,(\rho - \langle\rho\rangle)$ is
of the order $\rho\,(V/c)^2$. The new formulas compute the same quantities as the original
formulas, and they incur a smaller loss of precision. Loss of precision still occurs when
the terms proportional to $(V/c)$ and $(V/c)^2$ are combined.

Figure 4-13: The error of the lattice Boltzmann method d2q7H is shown when single-precision
arithmetic is used, when single-precision arithmetic together with the algebraic
transformation of section 4.7.1 is used, and when double-precision arithmetic is
used (dotted, dashed, solid lines).
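
The benefit of subtracting the constant part can also be illustrated in emulated single precision. In the sketch below all names and numerical values are assumptions made for illustration: `q` stands for the sum of the small velocity-dependent terms, the density deviation is chosen to be exactly representable in single precision, and the reference values are computed in double precision.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def original_single(rho, w0, q):
    """F = rho * (w0 + q), with every intermediate rounded to single precision."""
    return f32(f32(rho) * f32(f32(w0) + f32(q)))

def shifted_single(rho, rho_mean, w0, q):
    """F~ = w0 * (rho - rho_mean) + rho * q, rounded to single precision at each step."""
    a = f32(f32(w0) * f32(f32(rho) - f32(rho_mean)))
    b = f32(f32(rho) * f32(q))
    return f32(a + b)

rho, rho_mean = 1.0 + 2.0 ** -18, 1.0   # small density deviation (illustrative)
w0, q = 0.125, 1.0e-7                   # illustrative weight and small term

exact_original = rho * (w0 + q)                  # double-precision reference
exact_shifted = exact_original - w0 * rho_mean   # same quantity minus the constant
err_original = abs(original_single(rho, w0, q) - exact_original)
err_shifted = abs(shifted_single(rho, rho_mean, w0, q) - exact_shifted)
```

For these values the absolute error of the shifted population is several orders of magnitude smaller than that of the original population. The sketch does not address the precision with which the density itself is stored, which is one reason why the transformation reduces the roundoff error without eliminating it.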

To verify the above analysis, figure 4-13 compares the error of the lattice Boltzmann
method (d2q7H version) when single-precision arithmetic is used, when the algebraic
transformation (together with single-precision arithmetic) is used, and when
double-precision arithmetic is used (dotted, dashed, solid lines). The data comes from
simulations of the hexagonal Taylor vortex with periodic boundary conditions and
30 × 30 grid. The error is plotted against $M_c$ with $\Delta t$ varying and is calculated at the
final time T = 1.0. We see that when single-precision arithmetic is used, and the speed
$\Delta x/\Delta t$ exceeds 300 (therefore $M_c < 0.003$), there is a growth of error that is caused
by numerical roundoff. The algebraic transformation with single-precision arithmetic
can reduce the roundoff error but cannot prevent it. Double-precision arithmetic is
necessary to prevent the error growth in the Taylor vortex for $M_c < 0.003$.

4.7.2  Lattice  gas  methods 

This section discusses some background material regarding the relation between lattice
Boltzmann and lattice gas methods.

The lattice Boltzmann approach for simulating fluids is an outgrowth of the lattice
gas approach [20, 15, 58, 23]. Both of these approaches have their origins in the kinetic
theory of gases. The common idea behind them is that the advection and collision
of particles can lead to the Navier-Stokes equations when the collision of particles
conserves mass, momentum, and energy. Furthermore, the particles must move along
the edges of a numerical grid that is highly symmetric [20, 15, 58] and is called
a lattice. For example, typical grids in two dimensions are the hexagonal and the
orthogonal lattices. In three dimensions, a cubic lattice is commonly used.

One difference between the lattice gas and the lattice Boltzmann approach is
that the former represents the lattice particles with binary values 0 or 1, while the
latter represents the particles with floating-point numbers. A binary value 0 or 1
represents the absence or presence of a single particle, while a floating-point number
represents a density of particles. The change from single-bit variables to floating-point
numbers has important consequences. From a mathematical point of view, the
lattice Boltzmann method is easier to analyze and more flexible than the lattice gas
method. In addition, the lattice Boltzmann method does not require averaging to
remove statistical noise as does the lattice gas method.

One advantage of lattice gas over lattice Boltzmann is that single-bit operations
may be desirable for special-purpose computers and for future technologies (quantum-bit
computers have been mentioned in this context). Today, almost all computers
are designed for floating-point operations, and they are well-suited for the lattice
Boltzmann approach. However, special-purpose computers have been built for single-bit
operations of the lattice gas approach [53], and they are promising.
In comparing lattice gas and lattice Boltzmann approaches, it helps to view the
lattice Boltzmann method as a lattice gas method with a very large number of particles
per direction as opposed to one or zero particles. Further, we may observe that
the number of velocity directions at each lattice node is a small number for the lattice
Boltzmann method (to reduce computer memory requirements), while it varies
from small to large for lattice gas methods. This is important because it has been
reported [17] that lattice gas methods with a large number of velocity directions are
more flexible and closer to correct hydrodynamics than lattice gas methods with a
very small number of velocity directions. If this is true, then we may conclude that
there are two ways to improve lattice gas methods: either by increasing the number
of particles per direction (which eventually produces lattice Boltzmann), or by
increasing the number of velocity directions per lattice node.

To carry the above discussion further, we may ask, "what about intermediate
schemes which increase both the number of directions per node and the number
of particles per direction?" For example, the 9-speed lattice Boltzmann model of
section 4.3 uses 9 double-precision floating point numbers (64 bits × 9) per lattice
node because it has 8 directions and one non-moving population. An intermediate
scheme with equivalent amount of memory might use 72 directions per lattice node
with $2^8$ particles (one byte) per direction. Would such an intermediate scheme perform
better than lattice gas and lattice Boltzmann? In general, the question is to find the
optimal distribution of bit-information to the physical degrees of freedom (number of
directions, and number of particles per direction). This is an unsolved problem.
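
The memory arithmetic of this example is easy to tabulate. The sketch below uses a hypothetical helper and an assumed set of word sizes; it lists the ways of splitting the memory budget of one 9-speed node between the number of directions and the number of bits per direction.

```python
def equal_memory_schemes(budget_bits, word_sizes=(1, 8, 16, 32, 64)):
    """Return (directions, bits_per_direction) pairs that use exactly budget_bits."""
    return [(budget_bits // b, b) for b in word_sizes if budget_bits % b == 0]

budget = 9 * 64                          # nine double-precision populations per node
schemes = equal_memory_schemes(budget)   # from 1 bit per direction up to 64 bits
```

The two ends of the list correspond to a lattice-gas-like allocation (one bit per direction) and to the 9-speed lattice Boltzmann model (64 bits per direction), with the 72-direction scheme of the text in between.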


Chapter  5 

Artificial-viscosity  filter 


This chapter discusses the need for an artificial-viscosity filter for dissipating
numerical instabilities of high spatial frequency. Such a filter must be used both with
the lattice Boltzmann method and with the compressible finite difference method of
section 3.3 for flows with high Reynolds number.

Similar types of artificial-viscosity filters have been traditionally used in simulations
of supersonic and transonic flow (Peyret & Taylor [38]). The idea of artificial-viscosity
filters goes back to Richtmyer & Morton [43] and perhaps earlier. However,
a theoretical analysis of such filters is lacking, as far as I know. The analysis presented
below is a first step towards a better understanding of artificial-viscosity filters.

5.1  Evidence  of  high-frequency  oscillations 

One of the difficulties of simulating subsonic compressible flow is the appearance of
slow-growing high-frequency oscillations in the computed solution. These oscillations
persist for a long time before they eventually overwhelm the solution and cause an
exponential blow-up. The spatial wavelength of the oscillations is of the order of the
mesh size $\Delta x$. The conditions that seem to trigger the oscillations include impulsive
changes of density, high speed flow, and small viscosity (high Reynolds number flow).
Flow examples include the uniform flow past a sharp obstacle at high speed, and a
jet of air impinging on the labium of a flue pipe (see figures 5-1 and 5-2).


Figure 5-1: Iso-density contours in the flue-labium region, mean blowing velocity
1104 cm/s. High spatial frequencies cause instabilities if left untreated.

Figures 5-1 and 5-2 show snapshots of the density in simulations which would
become unstable without the use of an artificial-viscosity filter. In particular, the
flue-labium region of a flue pipe is shown. The lattice Boltzmann method is used together
with a fourth-order artificial-viscosity filter with α = 0.008 (explained below).
Iso-density contours are plotted, and a horizontal cut of the density is shown at
the top of the picture. The horizontal cut starts from the bottom surface of the flue
channel, and continues parallel to and under the labium. High-frequency variations of
density can be seen in the region between the flue and the labium in both simulations.
Such high-frequency disturbances can cause instabilities if left untreated.

Figure 5-2: Iso-density contours in the flue-labium region, mean blowing velocity
1995 cm/s. High spatial frequencies cause instabilities if left untreated.

5.2  The  fourth-order  filter 


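The high-frequency oscillatory instabilities can be mitigated by using a fourth-order
artificial-viscosity filter as follows,

$$ V^{n+1} = V^n - \alpha \left( (\Delta x)^4 \frac{\partial^4 V^n}{\partial x^4} + (\Delta y)^4 \frac{\partial^4 V^n}{\partial y^4} \right) \tag{5.1} $$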

The above filter is applied at the end of every integration step to all three variables
ρ, V_x, V_y. The parameter α controls the dissipation of the filter. In the case of the
lattice Boltzmann method, a typical value is α = 0.008. In the case of the
compressible finite difference method of section 3.3, a larger value is used, typically
α = 0.015, because the finite difference method is more sensitive to instabilities than
the lattice Boltzmann method. If α is too large, the solution is distorted (incorrect
physical modeling) and may even become unstable. According to a linear stability
analysis which is described below, the largest value of α for stable 2D calculations is
1/16, namely α ≤ 0.0625. However, α should be less than 1/32 to produce a desirable
filter (see figure 5-3). In practice, even smaller values of α are recommended (near
0.01) to avoid distorting the solution.

The discretization of the fourth-order filter in the x-direction is as follows,

$$ (\Delta x)^4 \frac{\partial^4 V}{\partial x^4} \approx V_{j-2} - 4 V_{j-1} + 6 V_j - 4 V_{j+1} + V_{j+2} \tag{5.2} $$

The above discretization is used at all the interior points which are at least 2 grid
points away from the boundary. At the boundary points and at the next-to-boundary
points, the above stencil cannot be applied because it would reach outside the grid.
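As an illustration of the interior discretization, the following is a minimal Python sketch of one filter pass along a single grid line. The function name `fourth_order_filter_1d` and the use of plain Python lists are assumptions made for this example; they are not taken from the simulation system described in this thesis. Points closer than two grid points to either end are simply copied through.

```python
def fourth_order_filter_1d(v, alpha):
    """Apply one pass of the fourth-order filter along one grid line.

    Only points at least two grid points away from either end are changed;
    boundary and next-to-boundary points are copied through untouched here.
    """
    out = list(v)
    for j in range(2, len(v) - 2):
        # Five-point stencil 1, -4, 6, -4, 1 of the fourth difference.
        d4 = v[j - 2] - 4.0 * v[j - 1] + 6.0 * v[j] - 4.0 * v[j + 1] + v[j + 2]
        out[j] = v[j] - alpha * d4
    return out


# A linear profile has a zero fourth difference and is left alone, while a
# mesh-scale sawtooth (+1, -1, +1, ...) is damped at the interior points.
smooth = [0.5 * j for j in range(10)]
sawtooth = [(-1.0) ** j for j in range(10)]
filtered_smooth = fourth_order_filter_1d(smooth, 0.008)
filtered_sawtooth = fourth_order_filter_1d(sawtooth, 0.008)
```

With α = 0.008 the fourth difference of the sawtooth is ±16, so each pass scales its interior values by 1 − 16α = 0.872, whereas the linear profile is unchanged.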

In order to filter the nodes near the boundary in a consistent way, a third-order
differencing formula must be used at the next-to-boundary points as follows,

$$ V_j^{n+1} = V_j^n + \alpha \left( V_{j+1}^n - 3 V_j^n + 3 V_{j-1}^n - V_{j-2}^n \right) \tag{5.3} $$

where the small index is j = J − 1 and the capital index J corresponds to the
boundary point. Similar formulas must be used for the other boundary orientations.
If the above formula is not used, stability problems may arise at the boundary.

A simple way to understand and to derive formula 5.3 is to consider the global
conservation of the flow (total change in ρ, V_x, V_y) after the filter has been applied.
To do so, the contributions of the filter must be summed at each grid point. For example,
the total contribution of the filter at an interior point is zero: as the fourth-order
stencil (equation 5.2) is shifted along the x-direction, an interior point is multiplied
by each one of the five “peaks” of the fourth-order stencil (1, −4, 6, −4, 1) before
being added in, so that the total sum is zero. By contrast, the total contribution of
the fourth-order stencil at points near the boundary (V_J to V_{J−3}) is generally
non-zero. The third-order differencing formula adds in the necessary corrections to make
the total sum vanish, so that the filter obeys global conservation.

The third-order differencing formula at the next-to-boundary nodes plays an important
role in the parallel-computing approach described in chapter 6. Normally,
the boundaries of a simulation include the interior obstacles, and the perimeter that
encloses the simulated region. In parallel simulations, additional boundaries arise
because the global simulation region is divided into subregions which are computed
in parallel. The crossing between two subregions is a kind of “artificial” boundary.
Applying the fourth-order filter at the artificial boundary would require a lot
of communication between the subregions (the fourth-order stencil requires two next
neighbors). To save on communication, the fourth-order filter is not applied at the
“artificial” boundary. However, the third-order formula must be used instead, for
consistency. The author actually discovered the need for the third-order formula by
noticing a slow-growing instability at the artificial boundary of a parallel simulation.
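The conservation argument can be checked numerically. The sketch below is an illustration under stated assumptions, not the thesis code: it filters one grid line with the fourth-order stencil in the interior and, optionally, with the third-order formula at the two next-to-boundary points (the left-hand formula is taken to be the mirror image of the right-hand one). The name `filter_pass` and the test profile are hypothetical.

```python
import math


def filter_pass(v, alpha, use_closure=True):
    """Fourth-order filter in the interior; optional third-order closure
    at the two next-to-boundary points. Boundary points are never changed."""
    n = len(v)
    out = list(v)
    for j in range(2, n - 2):
        out[j] = v[j] - alpha * (v[j - 2] - 4.0 * v[j - 1] + 6.0 * v[j]
                                 - 4.0 * v[j + 1] + v[j + 2])
    if use_closure:
        # Right side: boundary index J = n - 1, next-to-boundary j = J - 1.
        j = n - 2
        out[j] = v[j] + alpha * (v[j + 1] - 3.0 * v[j] + 3.0 * v[j - 1] - v[j - 2])
        # Left side (assumed mirror image): boundary index 0, next-to-boundary j = 1.
        j = 1
        out[j] = v[j] + alpha * (v[j - 1] - 3.0 * v[j] + 3.0 * v[j + 1] - v[j + 2])
    return out


# Arbitrary non-symmetric profile used only for the check.
samples = [math.sin(0.9 * j) + 0.1 * j for j in range(12)]
drift_with_closure = sum(filter_pass(samples, 0.008)) - sum(samples)
drift_without_closure = sum(filter_pass(samples, 0.008, use_closure=False)) - sum(samples)
```

With the closure, the total of the filtered line equals the total of the original line up to rounding error; without it, the total drifts at every pass, which is consistent with the slow-growing boundary instability described above.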


5.3  Analysis  of  fourth-order  filter 


The fourth-order filter can be understood by considering the dissipation of frequencies
by a general m-th order filter,

$$ V^{n+1} - V^n = -\alpha \, (\Delta x)^m \, \frac{\partial^m V^n}{\partial x^m} \tag{5.4} $$


The analysis here treats the filter as an isolated system without considering the
coupling between the filter and the numerical solution. We write V in terms of spatial
frequencies κ,

$$ V^n = e^{i \kappa x}, \qquad V^{n+1} = G \, e^{i \kappa x} \tag{5.5} $$

where G is the growth factor, and the range of frequencies is 0 ≤ κ ≤ π/Δx. By
substituting equation 5.5 in equation 5.4, we obtain an estimate for the growth factor
G of the m-th order filter,

$$ G = 1 - i^m \, \alpha \, (\kappa \Delta x)^m \tag{5.6} $$

Here, the continuous version of the filter is considered for simplicity. The
discretization of the filter is discussed below. We have the following cases,

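$$
\begin{array}{lll}
m = 2 & G = 1 + \alpha (\kappa \Delta x)^2 & \alpha < 0 \\
m = 3 & G = 1 + i \alpha (\kappa \Delta x)^3 & \text{unstable?} \\
m = 4 & G = 1 - \alpha (\kappa \Delta x)^4 & \alpha > 0 \\
m = 6 & G = 1 + \alpha (\kappa \Delta x)^6 & \alpha < 0 \\
m = 8 & G = 1 - \alpha (\kappa \Delta x)^8 & \alpha > 0
\end{array}
\tag{5.7}
$$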

The case m = 2 corresponds to physical viscosity, and therefore it cannot be used
for artificial-viscosity filtering. The case m = 3 appears to amplify all frequencies for
any choice of α, and furthermore the frequencies are phase-shifted disproportionately.
Clearly, m = 3 is not a desirable filter, and similar conclusions hold for any odd
integer m. The even integers m are suitable for filtering, and the smallest possible
integer m = 4 corresponds to a fourth-order filter.
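The cases above can be reproduced with a few lines of Python. This is a minimal sketch of the continuous growth factor G = 1 − i^m α (κΔx)^m; the function name `continuous_growth_factor` and the sample values of α for m = 2 and m = 3 are assumptions chosen for illustration.

```python
import math

# Exact powers of the imaginary unit: i^m depends only on m mod 4.
I_POWERS = (1, 1j, -1, -1j)


def continuous_growth_factor(m, alpha, k_dx):
    """Growth factor G = 1 - i^m * alpha * (kappa * dx)^m of the m-th order filter."""
    return 1 - I_POWERS[m % 4] * alpha * k_dx ** m


highest = math.pi  # kappa * dx at the shortest resolvable wavelength
g2 = continuous_growth_factor(2, -0.01, highest)   # dissipative only for alpha < 0
g3 = continuous_growth_factor(3, 0.01, highest)    # complex, magnitude above one
g4 = continuous_growth_factor(4, 0.008, highest)   # dissipative for alpha > 0
g4_wrong_sign = continuous_growth_factor(4, -0.008, highest)  # amplifies
```

For m = 3 the growth factor is 1 + iα(κΔx)^3, whose magnitude exceeds one for either sign of α, which is why the odd orders are rejected.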

In comparing the even-power filters, we may observe that the sign of α must alternate
with increasing m = 2, 4, 6, ... in order to produce a dissipative filter. Also, larger
values of m produce “sharper” filters. A sharp filter means that the low frequencies
are affected very little, that the high frequencies κΔx ≈ π are strongly dissipated, and
that the transition (cutoff) point is very abrupt. Finally, we may observe that the
stability constraints on α become more stringent with increasing m. In particular,
the condition |G| ≤ 1 requires (for the continuous filter),

$$ |\alpha| \le \frac{2}{(\kappa \Delta x)^m} \ \text{ for all } \kappa, \qquad \text{hence} \qquad |\alpha| \le \frac{2}{\pi^m} \tag{5.8} $$

The fourth-order filter m = 4 is a good choice because it has the desirable filtering
behavior, as shown below in more detail, and also because m = 4 is the smallest
suitable even integer. The computational cost of the filter is proportional to m,
assuming that the filter is implemented via finite differences.

The discretization of the fourth-order filter based on symmetric differences is given
by equation 5.2, and it produces the following growth factor,

$$ G = 1 - \alpha \left( 6 - 8 \cos \kappa \Delta x + 2 \cos 2 \kappa \Delta x \right) = 1 - 4 \alpha \left( 1 - \cos \kappa \Delta x \right)^2 \tag{5.9} $$

For stability purposes, the magnitude of the growth factor must not exceed one,

$$ -1 \le G \le 1 \tag{5.10} $$

Using the largest possible frequency κΔx = π we obtain,

$$ 0 \le \alpha \le \frac{1}{8} \tag{5.11} $$

In two dimensions, it is easy to see that the growth factor becomes,

$$ G = 1 - 4 \alpha \left[ \left( 1 - \cos \kappa_1 \Delta x \right)^2 + \left( 1 - \cos \kappa_2 \Delta y \right)^2 \right] \tag{5.12} $$

which implies the following limits on α,

$$ 0 \le \alpha \le \frac{1}{16} \tag{5.13} $$

Figure 5-3: Amplification of spatial frequencies by the fourth-order artificial-viscosity
filter (2D discretized) for different values of α.

The growth factor of equation 5.12 is plotted in figure 5-3 for different values of α.
We can see that the maximum value for stability, α = 1/16, produces an undesirable
filter because the very high frequencies are simply multiplied by a minus sign and are
not dissipated. For a desirable filter, α should be

$$ \alpha \le \frac{1}{32} \tag{5.14} $$

In practice, even smaller values of α are preferred in order to prevent distortion of the
solution. For example, the value α = 0.008 produces very small dissipation of very
high frequencies only. This small dissipation is needed in order to avoid the
high-frequency numerical oscillations which appear in simulations of subsonic
compressible flow.
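The stability limits quoted above follow directly from the discrete growth factor, and a short Python sketch makes them easy to check. The function names are assumptions for this example; the formulas are the one-dimensional and two-dimensional growth factors of the symmetric fourth-order stencil.

```python
import math


def growth_1d(alpha, theta):
    """Compact form: G = 1 - 4*alpha*(1 - cos(theta))^2, theta = kappa*dx."""
    return 1.0 - 4.0 * alpha * (1.0 - math.cos(theta)) ** 2


def growth_1d_expanded(alpha, theta):
    """Same factor written out from the stencil coefficients 1, -4, 6, -4, 1."""
    return 1.0 - alpha * (6.0 - 8.0 * math.cos(theta) + 2.0 * math.cos(2.0 * theta))


def growth_2d(alpha, theta_x, theta_y):
    """Two-dimensional growth factor: the x and y contributions add."""
    return 1.0 - 4.0 * alpha * ((1.0 - math.cos(theta_x)) ** 2
                                + (1.0 - math.cos(theta_y)) ** 2)


# Growth factor at the highest frequency (pi, pi) and at a lower one (pi/4, pi/4).
for alpha in (1.0 / 16.0, 1.0 / 32.0, 0.015, 0.008):
    print(alpha,
          growth_2d(alpha, math.pi, math.pi),
          growth_2d(alpha, math.pi / 4.0, math.pi / 4.0))
```

At the highest frequency the two-dimensional factor is 1 − 32α, which gives −1 for α = 1/16 (sign flip without dissipation), 0 for α = 1/32 (complete removal), and 0.744 for α = 0.008.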


5.4  Other  kinds  of  filters 


The frequency analysis presented above can be continued in order to understand
the artificial-viscosity filters further. To this end, the shift operators S_{−1} and
S_{+1} are introduced, and they look as follows in the frequency domain,

$$ S_{-1} V = e^{-i \kappa \Delta x} \, V, \qquad S_{+1} V = e^{+i \kappa \Delta x} \, V \tag{5.15} $$

A second-order symmetric differencing formula can be written as follows,

$$ (\Delta x)^2 \, \delta_{xx} V = \left( S_{-1} - 2 + S_{+1} \right) V = -2 \left( 1 - \cos \kappa \Delta x \right) V \tag{5.16} $$


The discretization of an m-th order filter for even m = 2l based on symmetric
differences can be found by applying the above second-order difference operator l times,

$$ V^{n+1} = V^n - \alpha \, (\Delta x)^{2l} \, (\delta_{xx})^l \, V^n \tag{5.17} $$

The growth factor is as follows (for a one-dimensional filter),

$$ G = 1 - \alpha \, (-2)^l \left( 1 - \cos \kappa \Delta x \right)^l \tag{5.18} $$

The above expression is a generalization of the fourth-order formula obtained
previously for m = 4 or l = 2.
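The general even-order growth factor can be verified by applying the second difference repeatedly to a single Fourier mode. The sketch below assumes a periodic grid line so that the mode is an exact eigenvector of the shift operators; the periodic wrap, the function names, and the sample values of α are assumptions for illustration only.

```python
import cmath
import math


def second_difference(v):
    """Periodic (S_-1 - 2 + S_+1) applied to every point of the line."""
    n = len(v)
    return [v[j - 1] - 2.0 * v[j] + v[(j + 1) % n] for j in range(n)]


def even_order_filter(v, alpha, ell):
    """Filter of order m = 2*ell built from ell passes of the second difference."""
    d = list(v)
    for _ in range(ell):
        d = second_difference(d)
    return [vj - alpha * dj for vj, dj in zip(v, d)]


def growth_factor(alpha, ell, theta):
    """G = 1 - alpha * (-2)^ell * (1 - cos(theta))^ell with theta = kappa*dx."""
    return 1.0 - alpha * (-2.0) ** ell * (1.0 - math.cos(theta)) ** ell


n, mode = 16, 5
theta = 2.0 * math.pi * mode / n
wave = [cmath.exp(1j * theta * j) for j in range(n)]

# (ell, alpha) pairs; the sign of alpha alternates with ell to stay dissipative.
cases = ((1, -0.05), (2, 0.008), (3, -0.001))
measured = {ell: even_order_filter(wave, a, ell)[3] / wave[3] for ell, a in cases}
```

For each pair, the ratio of filtered to unfiltered mode agrees with `growth_factor`, and the case `ell = 2` reproduces the fourth-order result.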

The frequency analysis can also be applied to “tophat” averaging filters in the
context of fluid flow simulations. When high frequencies must be removed, tophat
averaging is the first idea that comes to mind. For example, a two-point averaging
formula is as follows,

$$ V^{n+1} = \frac{1 + S_{+1}}{2} \, V^n \tag{5.19} $$


The above filter is undesirable because it causes significant dissipation of low
frequencies as well as high frequencies, and also because it causes phase distortion, as
can be seen from the imaginary part of the growth factor (1 + e^{iκΔx})/2. Thus, we may
try a three-point averaging formula,

$$ V^{n+1} = \frac{S_{-1} + 1 + S_{+1}}{3} \, V^n \tag{5.20} $$


The growth factor is (1 + 2 cos κΔx)/3, and has no imaginary components, which is
good. However, the high frequencies κΔx > 2π/3 are multiplied by a minus sign
and are not dissipated completely. For example, the highest frequency κΔx = π is
multiplied by −1/3. The 3-point averaging filter can be improved by considering a
weighted 3-point averaging,

$$ V^{n+1} = V^n (1 - \alpha) + \alpha \, \frac{S_{-1} + 1 + S_{+1}}{3} \, V^n \tag{5.21} $$

The above expression is actually equivalent to a second-order viscosity filter, as can
be seen by rewriting it as follows,

$$ V^{n+1} = V^n + \frac{\alpha}{3} \left( S_{-1} - 2 + S_{+1} \right) V^n $$


Clearly, a weighted 3-point averaging filter is undesirable because it affects the
physical viscosity. Furthermore, it is easy to see that the 4-point, 6-point, 8-point,
etc. averaging filters produce undesirable phase distortion (a discussion of phase
distortion according to an electrical engineering textbook can be found in
Siebert [47, p.472]). Therefore, the smallest viable choice is a 5-point averaging
filter. The general form of a weighted 5-point averaging filter, which does not cause
phase distortion, can be written as follows where β, γ are real numbers (weighting
factors),

$$ V^{n+1} = V^n (1 - \alpha) + \alpha \, \frac{\gamma S_{-2} + \beta S_{-1} + 1 + \beta S_{+1} + \gamma S_{+2}}{1 + 2 \beta + 2 \gamma} \, V^n $$

The fourth-order artificial-viscosity filter of equation 5.2 is a special case of the above
expression. This analysis puts the fourth-order artificial-viscosity filter in perspective,
and shows why the 2-point, 3-point, and 4-point averaging filters do not perform well
in fluid flow simulations, a fact which can be easily tested in actual simulations.
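The growth factors of the averaging filters discussed above can be compared side by side. The Python sketch below is illustrative only: the function names are hypothetical, and the weighted 5-point filter is assumed to be normalized by the sum of its weights, 1 + 2β + 2γ. Under that assumption, the choice β = 2/3, γ = −1/6 with averaging weight 12α reproduces the fourth-order filter with parameter α.

```python
import cmath
import math


def two_point(theta):
    """Growth factor (1 + e^{i*theta})/2 of the two-point average; complex."""
    return (1.0 + cmath.exp(1j * theta)) / 2.0


def three_point(theta):
    """Growth factor (1 + 2*cos(theta))/3 of the three-point average; real."""
    return (1.0 + 2.0 * math.cos(theta)) / 3.0


def weighted_five_point(alpha, beta, gamma, theta):
    """Symmetric 5-point weighted average, normalized by the sum of weights (assumed)."""
    norm = 1.0 + 2.0 * beta + 2.0 * gamma
    average = (1.0 + 2.0 * beta * math.cos(theta)
               + 2.0 * gamma * math.cos(2.0 * theta)) / norm
    return (1.0 - alpha) + alpha * average


def fourth_order(alpha, theta):
    """Growth factor of the fourth-order artificial-viscosity filter."""
    return 1.0 - 4.0 * alpha * (1.0 - math.cos(theta)) ** 2


thetas = [k * math.pi / 8.0 for k in range(9)]
mismatch = max(abs(weighted_five_point(12.0 * 0.008, 2.0 / 3.0, -1.0 / 6.0, t)
                   - fourth_order(0.008, t)) for t in thetas)
```

The two-point factor has a non-zero imaginary part (phase distortion), the three-point factor changes sign at κΔx = 2π/3 and equals −1/3 at κΔx = π, and `mismatch` is zero up to rounding error.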


5.5  The  origin  of  high-frequency  oscillations 

The origin of the slow-growing high-frequency numerical oscillations in simulations
of compressible flow is not well understood. It is possible that the triggering of the
oscillations is both numerical and physical. Peyret & Taylor [38, p.323] report that
high-frequency oscillations appear both in explicit and implicit methods for transonic
and supersonic compressible flow, which hints that there may be a physical cause that
triggers the oscillations.

It has been conjectured (Fletcher [18, p.438] and elsewhere) that physical turbulence
may be triggering the numerical oscillations. Turbulent flow produces high-frequency
disturbances whose wavelength is much smaller than the limited resolution
of computer simulations. Accordingly, it has been conjectured that a type of
frequency aliasing may be happening from the turbulent length scales to the coarser
length scales of the simulation. However, the details of such a mechanism have never
been shown, and they are not obvious. In particular, the algebraic system of difference
equations (the simulation) is not a sampling process of the underlying differential
equations of fluid flow. Perhaps a more plausible conjecture is that the discrete
system of equations inherits a tendency for a kind of “discrete turbulence” from the
continuous equations of fluid flow.

A related point is that physical turbulence provides a mechanism for dissipating
very high-frequency oscillations. This is the energy cascade idea: the energy of the
flow cascades from large-scale motion to smaller and smaller vortices until it is
dissipated. Perhaps the turbulent dissipation can be compared with the fourth-order
artificial-viscosity dissipation. This idea is the reason why fourth-order artificial
viscosity is sometimes referred to as a model of subgrid turbulence. However, a lot of
work remains to be done to understand how good (or how bad) a model of subgrid
turbulence the fourth-order artificial viscosity is.

This chapter completes the basic discussion of numerical methods and numerical
modeling. In the next chapter, the parallel computation of fluid dynamics is discussed.
Subsequently, in chapter 7, examples of simulations of flue pipes are presented which
complement the simulations already presented in chapter 1.


Chapter  6 

Parallel  Computing 

6.1  Introduction 

This chapter presents an effective approach to simulating fluid dynamics on a cluster
of  non-dedicated  workstations.  Concurrency  is  achieved  by  decomposing  the  flow 
problem  into  subregions,  and  by  assigning  the  subregions  to  parallel  subprocesses. 
The  use  of  explicit  numerical  methods  leads  to  small  communication  requirements. 
The  parallel  subprocesses  automatically  migrate  from  busy  hosts  to  free  hosts  in  order 
to  exploit  the  unused  cycles  of  non-dedicated  workstations,  and  to  avoid  disturbing 
the  regular  users.  The  system  is  straightforwardly  implemented  on  top  of  UNIX  and 
TCP/IP  communication  routines. 

Typical simulations achieve 80% parallel efficiency (speedup/processors) using 20
HP-Apollo workstations in a cluster of 25 non-dedicated workstations in total.
Detailed measurements of efficiency in simulating two- and three-dimensional
flows are presented, and a theoretical model of efficiency is developed which fits
the measurements closely. Two numerical methods of fluid dynamics are tested: finite
differences and the lattice Boltzmann method. Further, it is shown that the shared-bus
Ethernet network is adequate for two-dimensional simulations of fluid dynamics,
but limited for three-dimensional ones. It is expected that new technologies in the
near future, such as Ethernet switches, FDDI, and ATM networks, will make
three-dimensional simulations of fluid dynamics practical on a cluster of workstations.

Figure 6-1: Simulation of a flue pipe using 20 workstations in a 5 × 4 decomposition.

The  parallel  system  presented  here  is  well-suited  for  simulating  subsonic  flows 
which  involve  both  hydrodynamics  and  acoustic  waves;  for  example,  the  flow  of  air 
inside  wind  musical  instruments.  Such  flow  problems  favor  the  use  of  explicit  methods 
(see  section  3.2)  which  are  perfectly  parallelizable,  and  lead  to  low  communication 
requirements  between  parallel  processes.  The  use  of  explicit  methods  is  important  for 
parallel  computing  on  a  cluster  of  workstations  because  the  communication  capacity 
between  workstations  is  usually  small. 

In general, the use of explicit methods is recommended in situations where
increasing numbers of local processing units are available with minimum communication
capacity between the processing units. Such computers may be widespread in
the future; for instance, a future parallel computer may consist of millions of local
processing units, each unit having the power of one of today’s workstations. With this
perspective in mind, the work presented herein for a cluster of 20 to 25 workstations
may have applications for future parallel computers as well.

Outline 

Section 6.2 presents some examples of parallel simulations which demonstrate the
power of the present approach, and also help to motivate the subsequent sections.
Section 6.3 reviews parallel computing and local-interaction problems in general.
Sections 6.4 and 6.5 describe the implementation of the parallel simulation system,
including the automatic migration of processes from busy hosts to free hosts.
Section 6.6 explains the parallelization of numerical methods for fluid dynamics. Finally,
sections 6.7 and 6.8 measure experimentally the performance of the parallel system,
and also develop a theoretical model of parallel efficiency for local-interaction
problems which fits the measured efficiency well.

Most issues are discussed as generally as possible within the context of
local-interaction problems, and the specifics of fluid dynamics are limited to section 6.2
and section 6.6.


6.2  Examples  of  distributed  simulations 

The parallel simulation system is used to simulate subsonic flow, and in particular,
the flow of air inside flue pipes of wind musical instruments such as the organ, the
recorder, and the flute. This is a phenomenon that involves the interaction between
hydrodynamic flow and acoustic waves: when a jet of air impinges on a sharp obstacle in
the vicinity of a resonant cavity, the jet begins to oscillate strongly, and it produces
audible musical tones. The jet oscillations are reinforced by a nonlinear feedback
from the acoustic waves to the jet. Similar phenomena occur in human whistling
and in voicing of fricative consonants (Shadle [46]). Although sound-producing jets
have been studied for more than a hundred years, they remain the subject of active
research (Verge94 [57, 56], Hirschberg [26]) because they are very complex.

The parallel system presented herein can easily simulate flue pipes using uniform
orthogonal grids as large as 1200 × 1200 in two dimensions (1.5 million nodes) and
even larger. Typically, however, smaller grids such as 800 × 500 (0.38 million nodes)
are employed in order to reduce the computing time. For example, if we divide
an 800 × 500 grid into twenty subregions and assign each subregion to a different
HP9000/700 workstation, we can compute 70,000 integration steps in 12 hours of run
time. This produces about 12 milliseconds of simulated time, which is long enough
to observe the initial response of a flue pipe with a jet of air that oscillates at 1000
cycles per second.
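The figures quoted above imply a few derived quantities that are useful as a sanity check. The following Python lines are a back-of-the-envelope calculation based only on the numbers in the preceding paragraph; the variable names are assumptions, and the throughput figures are estimates derived here, not measurements reported in the text.

```python
steps = 70_000              # integration steps
wall_clock_hours = 12.0     # run time
simulated_seconds = 12e-3   # about 12 milliseconds of simulated time
jet_frequency_hz = 1000.0   # jet oscillation frequency
grid_nodes = 800 * 500      # nodes in the 800 x 500 grid
workstations = 20

# Simulated time covered by one integration step (in seconds).
time_step = simulated_seconds / steps

# Number of jet oscillation periods covered by the run.
jet_periods = simulated_seconds * jet_frequency_hz

# Aggregate and per-workstation throughput in node updates per second.
node_updates_per_second = grid_nodes * steps / (wall_clock_hours * 3600.0)
per_workstation = node_updates_per_second / workstations
```

Each integration step therefore advances the flow by roughly 0.17 microseconds, and the run covers about twelve periods of the jet oscillation.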

Figure 6-1 shows a snapshot of an 800 × 500 simulation of a flue pipe by plotting
equi-vorticity contours (the curl of fluid velocity). The decomposition of the
two-dimensional space (5 × 4) = 20 is shown as dashed lines superimposed on top of the
physical region. The gray areas are walls, and the dark-gray areas are walls that
enclose the simulated region and demarcate the inlet and the outlet. The jet of air
enters from an opening on the left wall, impinges on the sharp edge in front of it, and
eventually exits from the simulation through the opening on the right part of the
picture. The resonant pipe is located at the bottom part of the picture.

Figure 6-2 shows a snapshot of another simulation that uses a slightly different
geometry than figure 6-1. In particular, figure 6-2 includes a long channel through
which the jet of air must pass before impinging on the sharp edge. Also, the outlet of
the simulation is located at the top of the picture as opposed to the right. This is
convenient because the air tends to move upwards after impinging on the sharp edge.
Overall, figure 6-2 is a more realistic model of flue pipes than figure 6-1.

From a computational point of view the geometry of figure 6-2 is interesting
because there are subregions that are entirely gray, i.e. they are entirely solid walls.
Consequently, these subregions need not be assigned to any workstation. Thus,
although the decomposition is (6 × 4) = 24, only 15 workstations are employed for this
problem. In terms of the number of grid nodes, the full rectangular grid is 1107 × 700
or 0.7 million nodes, but only 15/24 of the total nodes or 0.48 million nodes are
simulated. This example shows that an appropriate decomposition of the problem
can reduce the computational effort in some cases, as well as provide opportunities
for parallelism. More sophisticated decompositions can be even more economical
than the present ones. Uniform decompositions and identically shaped subregions are
employed here because they are very simple.

Figure 6-2: Simulation of a flue pipe using 15 workstations in a 6 × 4 decomposition
with 9 subregions inactive.
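The idea of skipping subregions that contain only solid walls can be sketched in a few lines of Python. The mask layout, the function name `active_subregions`, and the toy geometry below are hypothetical; the sketch only illustrates how a uniform decomposition can be scanned for blocks that contain at least one fluid node.

```python
def active_subregions(is_fluid, parts_x, parts_y):
    """Return the (px, py) blocks of a uniform decomposition that contain fluid.

    is_fluid is a list of rows; is_fluid[y][x] is True for a fluid node
    and False for a solid-wall node.
    """
    ny, nx = len(is_fluid), len(is_fluid[0])
    active = []
    for py in range(parts_y):
        y0, y1 = py * ny // parts_y, (py + 1) * ny // parts_y
        for px in range(parts_x):
            x0, x1 = px * nx // parts_x, (px + 1) * nx // parts_x
            if any(is_fluid[y][x] for y in range(y0, y1) for x in range(x0, x1)):
                active.append((px, py))
    return active


# Toy geometry: '#' is solid wall, '.' is fluid; decomposed into 4 x 2 blocks.
rows = [
    "####....",
    "####....",
    "........",
    "........",
]
mask = [[cell == "." for cell in row] for row in rows]
active = active_subregions(mask, parts_x=4, parts_y=2)

# Node count for the 6 x 4 decomposition with 15 active subregions.
simulated_nodes = 1107 * 700 * 15 / 24
```

In the toy geometry two of the eight blocks are entirely solid, so only six processes would be needed; for the grid discussed above, 15/24 of 1107 × 700 nodes is about 0.48 million nodes.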

The above simulations have been performed using the lattice Boltzmann method.
Similar results are obtained using a finite difference approach. Further issues on
parallelization of fluid dynamics are discussed in section 6.6. Next, the basics of
local-interaction problems are reviewed, and the implementation of the parallel system is
described. These issues are important for understanding in detail how the parallel
system works and why it works well.

Figure 6-3: A problem of local interactions in two dimensions, and its decomposition
(2 × 2) into four subregions.

6.3  Local-interaction  computations 

We define a local-interaction computation as a set of “parallel nodes” that can be
positioned in space so that the nodes interact only with neighboring nodes. For
example, figure 6-3 shows a two-dimensional space of parallel nodes which are connected by
solid lines representing the local interactions. In this example, the interactions extend
to a distance of one neighbor, and have the shape of a star stencil, but other patterns
of local interactions are also possible. Figure 6-4 shows two typical interactions which
extend to a distance of one neighbor, a star stencil and a full stencil.

The parallel nodes of a local-interaction problem are the finest grain of parallelism
that is available in the problem; namely, they are the finest decomposition of the
problem into units that can evolve in parallel after communication of information with
their neighbors. In practice, the parallel nodes are usually grouped into subregions
of nodes, as shown in figure 6-3 by the dashed lines. Each subregion is assigned to a
different processor, and the problem is solved in parallel by executing the following
sequence of steps repeatedly,

•  Calculate  the  new  state  of  the  interior  of  the  subregion  using  the  previous  history 
of  the  interior  as  well  as  the  current  boundary  information  from  the  neighboring 
subregions. 

•  Communicate  boundary  information  with  the  neighboring  subregions  in  order 
to  prepare  for  the  next  local  calculation. 

The  boundary  that  is  communicated  between  subregions  is  the  outer  surface  of  the 
subregions.  Section  6.4.2  describes  a  good  way  of  organizing  this  communication. 
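The two-step cycle can be emulated serially to show that a decomposed computation reproduces the undecomposed one. The Python sketch below is a toy under stated assumptions: a one-dimensional line of nodes, a three-point averaging update standing in for the real numerical method, one layer of extra boundary cells per subregion, and hypothetical function names. It is not the parallel system itself, which runs the subregions as separate processes.

```python
def step_global(v):
    """Reference update on the whole line; the two end values stay fixed."""
    out = list(v)
    for j in range(1, len(v) - 1):
        out[j] = 0.25 * v[j - 1] + 0.5 * v[j] + 0.25 * v[j + 1]
    return out


def split(v, parts):
    """Cut v into contiguous chunks, each padded with one extra cell per side."""
    n = len(v)
    bounds = [p * n // parts for p in range(parts + 1)]
    return [[0.0] + list(v[bounds[p]:bounds[p + 1]]) + [0.0] for p in range(parts)]


def communicate(chunks):
    """Copy each chunk's edge values into the padding cells of its neighbors."""
    for p in range(len(chunks) - 1):
        chunks[p][-1] = chunks[p + 1][1]   # right padding <- neighbor's first real node
        chunks[p + 1][0] = chunks[p][-2]   # left padding  <- neighbor's last real node


def compute(chunks):
    """Update the real nodes of every chunk using only chunk-local data."""
    last = len(chunks) - 1
    for p, c in enumerate(chunks):
        new = list(c)
        lo = 2 if p == 0 else 1                        # global left end stays fixed
        hi = len(c) - 2 if p == last else len(c) - 1   # global right end stays fixed
        for j in range(lo, hi):
            new[j] = 0.25 * c[j - 1] + 0.5 * c[j] + 0.25 * c[j + 1]
        chunks[p] = new


def run_decomposed(v, parts, steps):
    chunks = split(v, parts)
    for _ in range(steps):
        communicate(chunks)   # exchange boundary information
        compute(chunks)       # local calculation
    return [x for c in chunks for x in c[1:-1]]


def run_global(v, steps):
    for _ in range(steps):
        v = step_global(v)
    return v


initial = [float((3 * j) % 7) for j in range(20)]
decomposed = run_decomposed(initial, parts=4, steps=5)
reference = run_global(initial, steps=5)
```

After any number of steps, `decomposed` matches `reference`, because every real node sees exactly the neighbor values it would see in the undecomposed line.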

Local-interaction problems are well suited for parallel computing because the
communication is local, and also because the amount of communication relative to
computation can be controlled by varying the decomposition. In particular, when
each subregion is as small as one node (one processor per node), there is maximum
parallelism, and a lot of communication relative to the computation of each processor.
As the size of each subregion increases (which is called “coarse-graining”), both the
parallelism and the amount of communication relative to computation decrease.
This is because only the surface of a subregion communicates with other subregions.
Eventually, when one subregion includes all the nodes in the problem, there is no
parallelism and no need for communication anymore. Somewhere between these
extremes, we often find a good match between the size of the subregion (the “parallel
grain size”) and the communication capabilities of the computing system. This is
the reason why local-interaction problems are very flexible and highly desirable for
parallel computing.

Figure 6-4: A star stencil and a full stencil represent two typical nearest neighbor
local interactions.
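The effect of coarse-graining on the ratio of communication to computation can be estimated with a simple surface-to-volume count. The Python sketch below is a rough estimate under stated assumptions: rectangular subregions, `layers` boundary layers exchanged per face, corner nodes ignored, and a hypothetical function name.

```python
import math


def comm_to_comp_ratio(shape, layers=1):
    """Boundary nodes exchanged per node computed, for a box-shaped subregion.

    shape is a tuple of side lengths in nodes, e.g. (160, 125) in 2D.
    """
    volume = math.prod(shape)
    # Each axis contributes two faces, each with volume/side nodes per layer.
    surface = sum(2 * layers * (volume // side) for side in shape)
    return surface / volume


one_node = comm_to_comp_ratio((1, 1))           # one processor per node
subregion_2d = comm_to_comp_ratio((160, 125))   # 800 x 500 grid split 5 x 4
subregion_3d = comm_to_comp_ratio((40, 40, 40))
```

With one node per processor the ratio is 4, while a 160 × 125 subregion gives about 0.03. A cubic three-dimensional subregion of 40 nodes per side gives 0.15, which is one way to see why three-dimensional problems place higher demands on the network.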

6.4  The  distributed  system 

The design of the parallel system follows the basic ideas of local-interaction parallel
computing that are discussed above. This section describes the implementation of
the parallel system, which is based on UNIX and TCP/IP communication routines,
and exploits the common file system of the workstations.

6.4.1  The  main  modules 

For  the  sake  of  programming  modularity,  the  parallel  simulation  system  is  organized 
into  the  following  four  modules: 

• The initialization program produces the initial state of the problem to be solved
as if there were only one workstation.

• The decomposition program decomposes the initial state into subregions,
generates local states for each subregion, and saves them in separate files, called
“dump files”. These files contain all the information that is needed by a
workstation to participate in a distributed computation.

• The job-submit program finds free workstations in the cluster, and begins a
parallel subprocess on each workstation. It provides each process with a dump
file that specifies one subregion of the problem. The processes execute the same
program on different data.

• The monitoring program runs every few minutes and checks that the parallel
processes are progressing correctly. If an unrecoverable error occurs, the
distributed simulation is stopped, and a new simulation is started from the last
state, which is saved automatically every 10 to 20 minutes. If a workstation
becomes too busy, automatic migration of the affected process takes place, as
explained in section 6.5.

All  of  the  above  programs  (initialization,  decomposition,  submit,  and  monitoring)  are 
performed  by  one  designated  workstation  in  the  cluster.  Although  it  is  possible  to 
perform  the  initialization  and  the  decomposition  in  a  distributed  fashion  in  principle, 
a  serial  approach  is  chosen  here  for  simplicity. 

Regarding  the  selection  of  free  workstations,  the  strategy  is  to  separate  all  the 
workstations  into  two  groups:  workstations  with  active  users,  and  workstations  with 
idle  users  (meaning  more  than  20  minutes  idle  time).  An  idle-user  does  not  necessarily 
imply an idle workstation because background jobs may be running; however, an idle
user is preferred to an active user. Thus, the idle-user workstations are examined first
to see if the fifteen-minute average of the CPU load is below a pre-set value, in which
case  the  workstation  is  selected.  For  example,  the  load  must  be  less  than  0.6  where 
1.0  means  that  a  full-time  process  is  running  on  the  workstation.  After  examining 
the  idle-user  workstations,  the  active-user  workstations  are  examined,  and  the  search 
continues  as  long  as  more  workstations  are  needed. 
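
The selection strategy above can be sketched as follows. This is a minimal illustration, not the job-submit program itself: the `Host` record, its field names, and the function `select_hosts` are hypothetical, and the idle time and load values are assumed to have been collected beforehand. Only the 20-minute idle cutoff and the example load limit of 0.6 come from the text.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    user_idle_minutes: float  # time since the user last touched the workstation
    load15: float             # fifteen-minute average of the CPU load

def select_hosts(hosts, needed, max_load=0.6, idle_cutoff=20.0):
    """Pick up to `needed` hosts: idle-user hosts are examined first,
    then active-user hosts; a host is accepted if its load is below max_load."""
    idle = [h for h in hosts if h.user_idle_minutes > idle_cutoff]
    active = [h for h in hosts if h.user_idle_minutes <= idle_cutoff]
    chosen = []
    for group in (idle, active):
        for host in group:
            if len(chosen) == needed:
                return chosen
            if host.load15 < max_load:
                chosen.append(host)
    return chosen
```

An idle-user workstation with a heavy background job is skipped, and an active-user workstation with a light load can still be chosen once the idle-user group is exhausted.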

In  addition  to  the  above  programs  (initialization,  decomposition,  submit,  and 
monitoring), there is also the program that is executed in parallel by all the workstations.
This program consists of two steps: “compute locally”, and “communicate
with  neighbors”.  Below  we  discuss  issues  relating  to  communication. 

6.4.2  Communication

The communication between parallel processes synchronizes the processes in an indirect
fashion because it encourages the processes to begin each computational cycle
together with their neighbors as soon as they receive data from their neighbors.
Thus, there is a local near-synchronization which also encourages a global
near-synchronization. However, neither local nor global synchronization is guaranteed,
and  in  special  circumstances  the  parallel  processes  can  be  several  integration  time 
steps  apart.  This  is  important  when  a  process  migrates  from  a  busy  host  to  a  free 
host,  as  explained  in  section  6.5  (also  see  the  appendix). 

The  communication  of  data  between  processes  is  organized  by  means  of  a  well- 
known  programming  technique  which  is  called  “padding”  or  “ghost  cells”  (Fox  [19], 
Camp [6]). Specifically, each subregion is padded with one or more layers of extra
nodes  on  the  outside.  One  layer  of  nodes  is  used  if  the  local  interaction  extends  to 
a  distance  of  one  neighbor,  and  more  layers  are  used  if  the  local  interaction  extends 
further.  Once  the  data  is  copied  from  one  subregion  onto  the  padded  area  of  a 
neighboring  subregion,  the  boundary  values  are  available  locally  during  the  current 
cycle  of  the  computation.  This  is  a  good  way  to  organize  the  communication  of 
boundary  values  between  neighboring  subregions. 
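
To make the padding idea concrete, here is a minimal one-dimensional sketch with two subregions and one layer of ghost cells. It is an illustration under simplifying assumptions, not the thesis code: the three-point average `smooth` stands in for any local interaction of range one, the exchange is a plain copy instead of a network message, and all function names are hypothetical.

```python
def smooth(u):
    """One cycle of a 3-point average on the interior nodes; the two end
    nodes (physical boundary or ghost cell) are left unchanged."""
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]

def split(u):
    """Decompose a 1D grid into two subregions, each padded with one ghost cell."""
    mid = len(u) // 2
    left = u[:mid] + [0.0]    # last entry is the ghost cell
    right = [0.0] + u[mid:]   # first entry is the ghost cell
    return left, right

def exchange(left, right):
    """Copy each subregion's outermost owned node into the neighbor's ghost cell."""
    left[-1] = right[1]
    right[0] = left[-2]

def parallel_step(left, right):
    """Communicate the boundary, then compute locally on each subregion."""
    exchange(left, right)
    return smooth(left), smooth(right)

def join(left, right):
    """Reassemble the global grid, dropping the ghost cells."""
    return left[:-1] + right[1:]
```

Because the ghost cells are refreshed before every cycle, the two subregions reproduce the result of running `smooth` on the undivided grid, and `smooth` itself needs no knowledge of the communication.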

In addition, padding leads to programming modularity in the sense that the computation
does not need to know anything about the communication of the boundary.
As long as we compute within the interior of each subregion, the computation can
proceed as if there was no communication at all. Because of this separation between
computation and communication, it is possible to develop a parallel program as a
straightforward extension of a serial program. In the present system, the fluid dynamics
code can be compiled either into a parallel program or into a serial program
depending on the settings of a few C-compiler directives. The main differences between
the parallel and the serial programs are the padded areas, and a subroutine
that communicates the padded areas between processes.

The subroutine that communicates the padded areas between processes is implemented
using “sockets” and the TCP/IP protocol. A socket is an abstraction in the
UNIX operating system that provides system calls to send and receive data between
UNIX processes on different workstations. A number of different protocols (types of
behavior) are available with sockets, and TCP/IP is the simplest one. This is because
the TCP/IP protocol guarantees delivery of any messages sent between two processes.
Accordingly, the TCP/IP protocol behaves as if there are two first-in-first-out channels
for writing data in each direction between two processes. Also, once a TCP/IP
channel is opened at startup, it remains open throughout the computation except
during migration when it must be re-opened, as explained later.

Opening  the  TCP/IP  channel  involves  a  simple  hand-shaking,  “I  am  listening  at 
this  port  number.  I  want  to  talk  to  you  at  this  port  number?  Okay,  the  channel  is 
open.”  The  port  numbers  are  needed  to  identify  uniquely  the  sender  and  the  recipient 
of  a  message  so  that  messages  do  not  get  mixed  up  between  different  UNIX  processes. 
Further,  the  port  numbers  must  be  known  in  advance  before  the  TCP/IP  channel  is 
opened.  Thus,  each  process  must  first  allocate  its  port  numbers  for  listening  to  its 
neighbors,  and  then  write  the  port  numbers  into  a  shared  file.  The  neighbors  must 
read  the  shared  file  before  they  can  connect  using  TCP/IP. 
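
The hand-shaking through a shared file can be sketched as follows. This is a minimal illustration, not the communication subroutine of the present system: both ends run inside one process on the loopback address so that the example is self-contained, the shared file is a small JSON table, and the names `open_listener`, `publish_port`, `lookup_port`, and `demo` are hypothetical. File locking, which a real shared file would need, is omitted.

```python
import json
import os
import socket
import tempfile

def open_listener():
    """Listen on a port chosen by the operating system; return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # port 0 asks the OS for any free port
    srv.listen(1)
    return srv, srv.getsockname()[1]

def publish_port(path, name, port):
    """Record `name -> port` in the shared file."""
    table = {}
    if os.path.exists(path):
        with open(path) as f:
            table = json.load(f)
    table[name] = port
    with open(path, "w") as f:
        json.dump(table, f)

def lookup_port(path, name):
    """Read the port that process `name` is listening on."""
    with open(path) as f:
        return json.load(f)[name]

def demo():
    """Process A listens and publishes its port; process B reads it and connects."""
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "ports.json")
        srv, port = open_listener()
        publish_port(shared, "A", port)
        b_side = socket.create_connection(("127.0.0.1", lookup_port(shared, "A")))
        a_side, _ = srv.accept()
        b_side.sendall(b"step1;")
        b_side.sendall(b"step2;")
        b_side.shutdown(socket.SHUT_WR)   # signal the end of the data
        received = b""
        while True:
            chunk = a_side.recv(1024)
            if not chunk:
                break
            received += chunk
        for s in (a_side, b_side, srv):
            s.close()
        return received
```

The bytes arrive in the order in which they were sent, which is the first-in-first-out behavior that the boundary exchange relies on.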

6.5  Transparency  to  other  users 

The  basic  operation  of  the  parallel  simulation  system  was  described  in  the  previous 
section.  Here,  the  issues  that  arise  when  sharing  the  workstations  with  other  users 
are  discussed.  Specifically,  there  are  two  issues  to  consider:  sharing  the  CPU  cycles  of 
each  workstation,  and  sharing  the  local-area  network  and  the  file  server.  The  sharing 
of  CPU  cycles  is  achieved  by  employing  an  automatic  migration  of  processes  from 
busy  hosts  to  free  hosts  as  explained  below. 

6.5.1  Automatic  migration  of  processes

The utilization of a workstation falls into one of three basic categories:

•  (i)  The  workstation  is  idle. 

•  (ii)  The  workstation  is  running  an  interactive  program  that  requires  fast  CPU 
response  and  few  CPU  cycles. 

• (iii) The workstation is running another full-time process in addition to a parallel
subprocess.

In the first two cases, it is appropriate to time-share the workstation with another
user.  Furthermore,  it  is  possible  to  make  the  distributed  computation  transparent  to 
the  regular  user  of  the  workstation  by  assigning  a  low  runtime  priority  to  the  parallel 
processes  (UNIX  command  “nice”).  Because  the  regular  user’s  tasks  run  at  normal 
priority,  they  receive  the  full  attention  of  the  processor  immediately,  and  there  is  no 
loss  of  interactiveness.  After  the  user’s  tasks  are  serviced,  there  are  enough  CPU 
cycles  left  for  the  distributed  computation. 

In the third case, when a workstation is running another full-time process in addition
to a parallel subprocess, the parallel process must migrate to a new host that
is free. This is because the parallel process interferes with the regular user, and further,
the whole distributed computation slows down because of the busy workstation.
Clearly,  such  a  situation  must  be  avoided. 

The  parallel  system  detects  the  need  for  migration  using  the  monitoring  program 
described  in  the  previous  section.  The  monitoring  program  checks  the  CPU  load 
of  every  workstation  via  the  UNIX  command  “uptime”,  and  signals  a  request  for 
migration if the five-minute-average load exceeds a pre-set value, typically 1.5. The
intent  is  to  migrate  only  if  a  second  full-time  process  is  running  on  the  same  host,  and 
to  avoid  migrating  too  often.  In  the  present  system,  there  is  typically  one  migration 
every  45  minutes  for  a  distributed  computation  that  uses  20  workstations  from  a  pool 
of 25 workstations. Also, each migration lasts about 30 seconds. Thus, the cost of
migration is insignificant: 30 seconds out of every 45 minutes is roughly 1% of the
running time.
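
The load check performed by the monitoring program can be sketched as follows. This is a minimal illustration that works on the text printed by “uptime”, assuming the usual trailing “load average: a, b, c” field (one-, five-, and fifteen-minute averages); the exact output format varies between UNIX systems, and the function names are hypothetical. The threshold of 1.5 is the typical value quoted above.

```python
import re

# Matches "load average: 2.10, 1.72, 0.95" and the "load averages:" variant
# that separates the three numbers with spaces.
LOAD_RE = re.compile(r"load averages?:\s*([\d.]+)[,\s]+([\d.]+)[,\s]+([\d.]+)")

def load_averages(uptime_output):
    """Return the (1-minute, 5-minute, 15-minute) load averages."""
    match = LOAD_RE.search(uptime_output)
    if match is None:
        raise ValueError("no load averages found in uptime output")
    return tuple(float(x) for x in match.groups())

def needs_migration(uptime_output, threshold=1.5):
    """Request a migration only if the five-minute average exceeds the threshold."""
    _, five, _ = load_averages(uptime_output)
    return five > threshold
```

Using the five-minute average instead of the one-minute average makes the check insensitive to short bursts of activity, which is consistent with the stated intent of not migrating too often.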

During  a  migration,  a  precise  sequence  of  events  takes  place  in  order  for  the 
migration to complete successfully:

•  The  affected  process  A  receives  a  signal  to  migrate. 

•  All  the  processes  get  synchronized. 

• Process A saves its state into a dump file, and stops running.

•  Process  A  is  restarted  on  a  free  host,  and  the  distributed  computation  continues. 

Signals  for  migration  are  sent  through  an  interrupt  mechanism,  “kill  -USR2”  (see 
UNIX  manual).  In  this  way,  both  the  regular  user  of  a  workstation  and  the  monitoring 
program  can  request  a  parallel  subprocess  to  migrate  at  any  time. 

The reason for synchronizing all the processes prior to migration is to simplify
the  restarting  of  the  processes  after  the  migration  has  completed.  In  addition,  the 
synchronization  allows  more  than  one  process  to  migrate  at  the  same  time  if  it  is 
desired.  A  synchronization  scheme  is  employed  which  instructs  all  the  processes  to 
continue  running  until  a  chosen  synchronization  time  step,  and  then  to  pause  for  the 
migration  to  take  place.  The  details  of  the  synchronization  scheme  are  described  in 
the  appendix. 
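
The idea of running until a chosen synchronization time step can be sketched as follows. The actual scheme is described in the appendix and is not reproduced here; this sketch assumes, purely for illustration, that the synchronization step is chosen as the furthest step reached by any process plus a small margin, so that no process has already passed it. The function names and the margin are hypothetical.

```python
def choose_sync_step(current_steps, margin=2):
    """Pick a time step that no process has passed yet
    (assumed rule: the furthest current step plus a margin)."""
    return max(current_steps) + margin

def run_until(step, sync_step):
    """Advance one process to the synchronization step.
    Returns the final step and the number of cycles that were run."""
    cycles = 0
    while step < sync_step:
        step += 1      # stands in for one compute-and-communicate cycle
        cycles += 1
    return step, cycles
```

Since the processes may be several integration steps apart, each one runs a different number of cycles before pausing, but all of them pause at the same time step.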

When  all  the  processes  reach  the  synchronization  time  step,  the  processes  that 
need  to  migrate  save  their  state  and  exit,  while  they  notify  the  monitoring  program 
to  select  free  workstations  for  them.  The  other  parallel  processes  suspend  execution 
and  close  their  TCP/IP  communication  channels.  When  the  monitoring  program 
finds free hosts for all the migrating processes, it sends a CONT signal to the waiting
processes.  In  response,  all  the  processes  re-open  their  communication  channels,  and 
the  distributed  computation  continues  normally. 

Overall,  the  migration  mechanism  is  designed  to  be  as  simple  as  possible.  In  fact, 
it  is  equivalent  to  stopping  the  computation,  saving  the  entire  state  on  disk,  and  then 
restarting, except that only the state of the migrating process is saved on disk. In contrast
to this simple migration mechanism, the migration of processes is a challenging task in
a general computing environment such as a distributed operating system [16]. In the
present system, the migration task has been simplified because the parallel processes
have  been  designed  appropriately  to  accommodate  migration  easily. 

6.5.2  Sharing  the  network  and  file  server 

An issue related to sharing the workstations with other users is the sharing of the
network and the file server. A distributed program must be carefully designed to
make sure that the system does not monopolize the network and the file server.
Abuse of shared resources is very easy in today’s UNIX operating system because
there are no direct mechanisms for controlling or limiting the use of shared resources.
Thus, a program such as FTP (file transfer) is free to send many megabytes of data
through  the  network,  and  to  monopolize  the  network,  so  that  the  network  appears 
“frozen”  to  other  users.  A  distributed  program  can  monopolize  the  network  in  a 
similar  way,  if  it  is  not  designed  carefully. 

The  present  parallel  distributed  system  does  not  monopolize  the  network  because 
it  includes  a  time  delay  between  successive  send-operations,  during  which  the  parallel 
processes  are  calculating  locally.  Moreover,  the  time  delay  increases  with  the  network 
traffic because the parallel processes must wait to receive data before they can start
the  next  integration  step.  Thus,  there  is  an  automatic  feedback  mechanism  that  slows 
down  the  distributed  computation,  and  allows  other  users  to  access  the  network  at 
the  same  time. 

Another situation to consider is when the parallel processes are writing data to
the common file system. Specifically, when all the parallel processes save their state
on disk at approximately the same time (a couple of megabytes per process), it is
very easy to saturate both the network and the file server. In order to avoid this
situation, a constraint is imposed that the parallel processes must save their state
one after the other in an orderly fashion, allowing sufficient time gaps between, so
that other programs can use the network and the file system. Thus, a saving operation
that would take 30 seconds and monopolize the shared resources, now takes 60 to 90
seconds but leaves free time slots for other programs to access the shared resources
at the same time. Overall, a careful design has made the distributed system mostly
transparent to the regular users of the workstations.

6.6  Fluid  dynamics 

Having described the basic operation of the distributed system, I now discuss the
parallelization of two numerical methods for simulating fluid dynamics: the explicit
finite difference method, and the lattice Boltzmann method. Both of these methods
are explicit, and are well-suited for simulating subsonic flow which involves both
hydrodynamics and acoustic waves. Further, both methods are well-suited for parallel
computing because they employ local interactions.

The explicit finite difference method is described in detail in chapter 3, and is a
straightforward discretization of the Navier-Stokes equations. Specifically, the spatial
derivatives are discretized using centered differences on a uniform orthogonal grid, and
the time derivatives are discretized using forward Euler differences. For the purpose
of improving numerical stability, the density equation is updated using the values of
velocity at time $t + \Delta t$. In other words, the velocity values are computed first, and
then the density values are computed as a separate step. The precise sequence of
computational steps for the finite difference method is as follows:

• Calculate $V_x, V_y$ (inner)

• Communicate: send/recv $V_x, V_y$ (boundary)

• Calculate $\rho$ (inner)

• Communicate: send/recv $\rho$ (boundary)

• Filter $\rho, V_x, V_y$ (inner)
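
The ordering of the velocity and density updates can be illustrated with a much simpler model than the one in chapter 3. The sketch below assumes a one-dimensional, linearized, inviscid acoustic system on a periodic grid, with no filter and no communication; the function names and the parameters `c` (sound speed) and `rho0` (mean density) are illustrative choices, not taken from the thesis code. Only the ordering, velocity first and then density from the new velocity, mirrors the sequence above.

```python
import math

def fd_step(rho, v, dt, dx, c=1.0, rho0=1.0):
    """One explicit step: centered differences in space, forward Euler in time."""
    n = len(rho)
    # 1. Velocity at t + dt, computed from the density at time t.
    v_new = [v[i] - dt * (c * c / rho0) * (rho[(i + 1) % n] - rho[i - 1]) / (2 * dx)
             for i in range(n)]
    # 2. Density at t + dt, computed from the *new* velocity.
    rho_new = [rho[i] - dt * rho0 * (v_new[(i + 1) % n] - v_new[i - 1]) / (2 * dx)
               for i in range(n)]
    return rho_new, v_new

def initial_wave(n, amplitude=0.01):
    """A small sinusoidal density perturbation at rest."""
    rho = [1.0 + amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]
    return rho, [0.0] * n
```

In a parallel run, the boundary values of the velocity would be exchanged between the two updates, and those of the density after the second update, which is why this method sends two messages per cycle.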

The  filter  that  is  included  above  is  crucial  for  simulating  subsonic  flow  at  high 
Reynolds  number  (fast  moving  flow).  The  simulation  of  subsonic  compressible  flow 
is susceptible to slow-growing numerical oscillations. The filter prevents instabilities
by dissipating high spatial frequencies whose wavelength is comparable to the grid
mesh size (the distance between neighboring fluid nodes). The same filter is used
both with the finite difference method and with the lattice Boltzmann method. A
detailed description of the filter can be found in chapter 5.

We recall from chapter 4 that the lattice Boltzmann method uses two kinds of
variables to represent the fluid, the traditional fluid variables $\rho, V_x, V_y$, and another
set of variables called populations $F_i$. During each cycle of the computation, the fluid
variables $\rho, V_x, V_y$ are computed from the $F_i$, and then the $\rho, V_x, V_y$ are used to relax
the $F_i$. Subsequently, the relaxed populations are shifted to the nearest neighbors of
each fluid node, and the cycle repeats. The precise sequence of computational steps
for the lattice Boltzmann method is as follows:

• Relax $F_i$ (inner)

• Shift $F_i$ (inner)

• Communicate: send/recv $F_i$ (boundary)

• Calculate $\rho, V_x, V_y$ from $F_i$ (inner)

• Filter $\rho, V_x, V_y$ (inner)
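
The relax, shift, and calculate steps can be illustrated with a toy model. The sketch below assumes a one-dimensional lattice with three populations per node (velocities 0, +1, -1), a standard single-relaxation-time collision, and a periodic grid; these are illustrative choices and not the model of chapter 4. The filter and the communication step are omitted, and all names and the value of `tau` are hypothetical.

```python
import math

WEIGHTS = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)
VELOCITIES = (0, 1, -1)

def equilibrium(rho, u):
    """Equilibrium populations with density rho and momentum rho * u."""
    return [w * rho * (1.0 + 3.0 * e * u + 4.5 * (e * u) ** 2 - 1.5 * u * u)
            for w, e in zip(WEIGHTS, VELOCITIES)]

def cycle(F, rho, u, tau=0.8):
    """One cycle: relax F_i, shift F_i, then recompute rho and u from F_i."""
    n = len(rho)
    # Relax each population toward the local equilibrium.
    for x in range(n):
        feq = equilibrium(rho[x], u[x])
        for i in range(3):
            F[i][x] -= (F[i][x] - feq[i]) / tau
    # Shift the moving populations to the nearest neighbors (periodic grid).
    F[1] = [F[1][x - 1] for x in range(n)]         # velocity +1 moves right
    F[2] = [F[2][(x + 1) % n] for x in range(n)]   # velocity -1 moves left
    # Calculate the fluid variables from the populations.
    new_rho = [F[0][x] + F[1][x] + F[2][x] for x in range(n)]
    new_u = [(F[1][x] - F[2][x]) / new_rho[x] for x in range(n)]
    return new_rho, new_u

def initial_state(n, amplitude=0.01):
    """Fluid at rest with a small density perturbation, populations at equilibrium."""
    rho = [1.0 + amplitude * math.sin(2 * math.pi * x / n) for x in range(n)]
    u = [0.0] * n
    F = [[equilibrium(rho[x], 0.0)[i] for x in range(n)] for i in range(3)]
    return rho, u, F
```

In a parallel run only the shifted populations cross subregion boundaries, so a single exchange of $F_i$ per cycle is sufficient.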

Regarding the communication of boundary values by the finite difference method
(FD) and the lattice Boltzmann method (LB), there are some differences that will
become important in the next two sections, when the performance of the parallel
simulation system is examined. The first difference is that FD sends two messages
per computational cycle as opposed to LB which sends all the boundary data in
one message. This results in slower communication for FD when the messages are
small because each message has a significant overhead in a local-area network. The
second difference is that LB communicates 5 variables (double precision floating-point
numbers) per fluid node in three dimensional problems, while FD communicates only
4 variables per fluid node. In two dimensional problems, both methods communicate
3 variables per fluid node.

Figure 6-5: Parallel efficiency in 2D simulations using lattice Boltzmann.


6.7  Experimental  measurements  of  performance 


The  performance  of  the  parallel  simulation  system  has  been  measured  when  using  the 
finite  difference  method  and  the  lattice  Boltzmann  method  to  simulate  a  well-known 
problem in fluid mechanics, Hagen-Poiseuille flow through a rectangular channel (Skordos
[48] and Landau & Lifshitz [32, p.51]). Below, measurements of the parallel efficiency
$f$ and the speedup $S$ are presented. These numbers are defined as follows,

$$ f = \frac{S}{P} = \frac{T_1}{P \, T_P} \tag{6.1} $$

where $T_P$ is the elapsed time for integrating a problem using $P$ processors, and $T_1$ is
the elapsed time for integrating the same problem using a single processor. The times
$T_P$ and $T_1$ for integrating a problem are measured by averaging over 20 consecutive
integration steps, and also by averaging over each processor that participates in the
parallel computation. The resulting average is the time interval it takes to perform
one integration step. The UNIX system call “gettimeofday” is used to obtain accurate
timings. Although most measurements are taken during the night, the workstations
are usually busy during the night as well as during the day. To avoid situations
where the Ethernet network is overloaded by a large FTP or something else, each
measurement is repeated twice, and the best performance is selected.

Figure 6-6: Parallel speedup in 2D simulations using lattice Boltzmann.
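
The definitions of equation 6.1 and the averaging procedure can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and the timings passed in are assumed to be elapsed times per integration step that were already recorded on each processor.

```python
def mean(values):
    return sum(values) / len(values)

def time_per_step(timings_per_processor):
    """Average the step times over consecutive steps on each processor,
    then over all the processors that participate."""
    return mean([mean(steps) for steps in timings_per_processor])

def speedup(t1, tp):
    """S = T_1 / T_P."""
    return t1 / tp

def efficiency(t1, tp, p):
    """f = S / P = T_1 / (P * T_P)."""
    return t1 / (p * tp)
```

For example, if one processor needs 8.0 seconds per step and four processors need 2.5 seconds per step on average, the speedup is 3.2 and the efficiency is 0.8.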

Twenty-five HP9000/700 workstations are used which are connected together by a
shared-bus Ethernet network. Sixteen of the workstations are 715/50 models, six are
720 models, and three are 710 models. The 715/50 workstations are based on a RISC
processor running at 50 MHz, and have an estimated performance of 62 MIPS and
13 MFLOPS, while the 720 and 710 workstations have a slightly lower performance.

For analysis purposes, we define the speed of a workstation as the number of fluid
nodes integrated per second, where the number of fluid nodes does not include the
padded areas discussed in section 6.4.2. The table below presents the speed of the
workstations for 2D and 3D simulations using the lattice Boltzmann method (LB) and
the finite difference method (FD). These numbers have been calculated by averaging
over simulations of different size grids that range from $100^2$ to $300^2$ fluid nodes in
2D, and from $10^3$ to $44^3$ in 3D. Also, the speeds have been normalized relative to the
speed of the 715/50 workstation.

Figure 6-7: Parallel efficiency in 2D simulations using finite differences.


          715/50       710          720
LB 2D     1.0 ± .04    .84 ± .02    .86 ± .08
LB 3D     .51 ± .01    .40 ± .01    .42 ± .02
FD 2D     1.24 ± .1    1.08 ± .1    1.17 ± .1
FD 3D     1.0 ± .1     .85 ± .1     .94 ± .1

The  relative  speed  of  1.0  corresponds  to  39132  fluid  nodes  integrated  per  second. 

In  the  graphs  of  parallel  speedup  and  efficiency,  I  use  the  715/50  workstation 
to  represent  the  single  processor  performance.  I  do  not  use  the  performance  of  the 
slowest workstation (the 710 model) for normalization purposes because it would
over-estimate the performance of the system. In particular, most of the workstations
are 715 models, and the strategy is to choose 715 models first before choosing the
slightly slower 710 and 720 models. I have tested that the speedup achieved by sixteen
workstations, which are all 715 models, does not change if one or two workstations
are replaced with 710 models. Thus, it makes sense to normalize the results using the
performance of the 715 model.

Figure 6-8: Parallel speedup in 2D simulations using finite differences.

Figure  6-5  shows  the  efficiency  as  a  function  of  grain  size  for  (2x2),  (3x3), 
(4x4),  and  (5x4)  decompositions  (triangles,  crosses,  squares,  circles).  The  horizontal 
axis plots the square root of the number of nodes $N$ of each subregion. We see that
good performance is achieved in two-dimensional simulations when the subregion per
processor is larger than $100^2$ fluid nodes. In the next section, a theoretical model of
parallel  efficiency  is  presented  which  predicts  very  accurately  the  experimental  results 
shown  in  figure  6-5  and  in  the  other  figures  also.  Figure  6-6  shows  the  speedup  for 
the  lattice  Boltzmann  method  (LB),  and  figures  6-7  and  6-8  show  the  efficiency  and 
speedup  for  the  finite  difference  method  (FD). 

We notice one difference between the FD and LB efficiency curves: the efficiency
decreases more rapidly for FD than LB as the subregion per processor decreases.
To understand this difference, we quote a general formula for the parallel efficiency,
which is derived in the next section (see equation 6.8),

$$ f = \left( 1 + \frac{T_{com}}{T_{calc}} \right)^{-1} \tag{6.2} $$

where $T_{com}$ and $T_{calc}$ are the communication and the computation time it takes to
perform one integration step. We observe that $T_{calc}$ is smaller for FD than LB (see the
table of speeds earlier), and moreover that $T_{com}$ becomes larger for FD than LB as the
subregion per processor decreases. The latter is true because each message in a local-area
network incurs an overhead, and FD communicates two messages per integration
step as opposed to LB which communicates only one message per integration step (see
end of section 6.6). Because of these differences between FD and LB, the efficiency
decreases more rapidly for FD than LB as the subregion per processor decreases.

Figure 6-9: The Ethernet network performs well for 2D simulations (triangles), but
poorly for 3D simulations (crosses).

Next, we compare the efficiency of three-dimensional simulations versus two-dimensional
ones. Figure 6-9 plots the efficiency of 2D and 3D simulations as a
function of the number of processors $P$. Here, a problem is simulated which grows
linearly with the number of processors $P$, and is decomposed as ($P \times 1$) in 2D, and as
($P \times 1 \times 1$) in 3D. The subregion per processor is held fixed at $120^2$ nodes in 2D, and
$25^3$ nodes in 3D, which are comparable sizes, equal to about 14,500 fluid nodes per
processor. We see that the efficiency remains high in 2D (triangles), and decreases
quickly in 3D (crosses) as the number of processors increases. This is because the
total traffic through the shared-bus network increases in proportion to the number
of processors, and this affects $T_{com}$ in equation 6.2 as explained in more detail in the
next section. Also, we note that 3D requires much more data to be communicated
per step than 2D. Thus, $T_{com}$ increases faster for 3D than 2D, and the efficiency drops
faster in the case of 3D simulations.

Figure 6-10: Parallel efficiency in 3D simulations using the lattice Boltzmann method.

Another way of examining the efficiency of 3D simulations is shown in figures 6-10
and 6-11. Figure 6-10 plots the efficiency against the size of the subregion for different
decompositions (2 × 2 × 2), (3 × 2 × 2), etc. We can see that the efficiency is rather
poor. Figure 6-11 plots the speedup against the total size of the problem. We can see
that the speedup does not improve when finer decompositions are employed because
the network is the bottleneck of the computation.

Figure  6-11:  Parallel  speedup  in  3D  simulations  using  the  lattice  Boltzmann  method.

The results shown in figures 6-10 and 6-11 have been obtained using the lattice
Boltzmann method. The parallel efficiency of the finite difference method (FD) in 3D
simulations is even worse than the lattice Boltzmann method (LB), and is not shown
here. The FD efficiency is worse than LB because the FD computes twice as fast as
LB per integration step (see earlier table of speeds), which makes the ratio $T_{com}/T_{calc}$
larger for FD than LB, and leads to lower efficiency according to equation 6.2.

Another point is that the low efficiency of 3D simulations is accompanied by frequent
network errors because of excessive network traffic. In particular, the TCP/IP
protocol fails to deliver messages after excessive retransmissions. Both the low efficiency,
and the network errors indicate the need for a faster network, or dedicated
connections between neighboring processors in order to perform 3D simulations
efficiently.

6.8  Theoretical  analysis  of  parallel  efficiency


In  order  to  understand  better  the  experimental  results  of  the  previous  section,  we 
discuss  here  a  theoretical  model  of  the  parallel  efficiency  of  local-interaction  problems. 
In  particular,  we  derive  a  formula  for  the  parallel  efficiency  in  terms  of  the  parallel 
grain  size  (the  size  of  the  subregion  that  is  assigned  to  each  processor),  the  speed 
of  the  processors,  and  the  speed  of  the  communication  network.  The  analysis  is 
based  on  two  assumptions:  (i)  the  computation  is  completely  parallelizable,  and 
(ii) the communication does not overlap in time with the computation. The first
assumption  is  valid  for  local-interaction  problems,  and  the  second  assumption  is  valid 
for  the  present  distributed  system.  The  extension  of  the  analysis  to  situations  where 
communication  and  computation  overlap  in  time  is  straightforward  as  we  shall  see 
afterwards. 

We first examine the relationship between the efficiency and the processor utilization.
We define the efficiency $f$ as the speedup $S$ divided by the number of processors
$P$. Further, we define the speedup $S$ as the ratio $T_1/T_P$ of the total time it takes to
solve a problem using one processor, denoted $T_1$, divided by the total time it takes to
solve the same problem using $P$ processors, denoted $T_P$. In other words, we have the
following expression.

$$ f = \frac{S}{P} = \frac{T_1}{P \, T_P} \tag{6.3} $$


We define the processor utilization $g$ as the time spent for computing, denoted
$T_{calc}$, divided by the total time spent for solving a problem which includes both
computing and waiting for communication to complete. Also, we use the simplifying
assumption that the communication and the computation do not overlap in time, so
that we define $T_{com}$ as the time spent for communication without any computation
occurring during this time. Thus, we have the following expression.

$$ g = \frac{T_{calc}}{T_{calc} + T_{com}} = \left( 1 + \frac{T_{com}}{T_{calc}} \right)^{-1} \tag{6.4} $$


To compare $f$ and $g$, we note that the values of both $f$ and $g$ range between the
following limits,

$$ 0 \le g \le 1, \qquad 0 \le f \le 1 \tag{6.5} $$

for the worst case and the best case respectively. We expect that high utilization $g$
corresponds to high parallel efficiency $f$. However, this depends on the problem that
we are trying to compute in parallel.

In the special case of a problem that is completely parallelizable, the processor
utilization $g$ is exactly equal to the parallel efficiency $f$. To show this, we use the
following relation as the definition of a problem being completely parallelizable.

$$ T_{calc} = \frac{T_1}{P} \tag{6.6} $$

Then, we also use the assumption that communication and computation do not overlap
in time, so that we can obtain a second relation.

$$ T_{calc} + T_{com} = T_P \tag{6.7} $$

By substituting equations 6.6 and 6.7 into equation 6.3, and comparing with equation
6.4, we arrive at the desired result that the parallel efficiency is exactly equal to
the processor utilization,

$$ f = \frac{T_1}{P \, T_P} = \frac{T_{calc}}{T_{calc} + T_{com}} = g = \left( 1 + \frac{T_{com}}{T_{calc}} \right)^{-1} \tag{6.8} $$

The above equation has been derived under the assumption that communication and
computation do not overlap in time. If this assumption is violated in a practical
situation, then the communication time $T_{com}$ should be replaced with a smaller time
interval, the effective communication time. This modification does not change the
conclusion $f = g$; it simply gives higher values of efficiency and utilization.

To proceed further, we need to find how the ratio $T_{com}/T_{calc}$ depends on the size of
the subregion. First, we observe that $T_{calc}$ is proportional to the size of the subregion.
If $N$ is the size of the subregion (the number of parallel nodes that constitute one
subregion), we can write,

$$ T_{calc} = \frac{N}{U_{calc}} \tag{6.9} $$

where $U_{calc}$ is a constant, the computational speed of the processors for the specific
problem at hand. In a similar way, we seek to find a formula for the communication
time $T_{com}$ in terms of the size of the subregion that is assigned to each processor. As
a first model, we write the following simple expression,

$$ T_{com} = \frac{N_c}{U_{com}} \tag{6.10} $$

where $N_c$ is the number of communicating nodes in each subregion, namely the outer
surface of each subregion. The factor $U_{com}$ represents the speed of the communication
network.
For analysis purposes, we want to know exactly how $N_c$ varies with the size of the
subregion N. We consider the geometry of a subregion in two dimensions. We can see
that the boundary of a subregion is one power smaller than the volume expressed in
terms of the number of nodes. For example, if we consider square subregions of size
$L^2$ nodes, the enclosing boundary contains 4L nodes, and the ratio of communicating
nodes to the total number of nodes per subregion can be as large as 4/L. In general,
we have the following relations,


$$N_c = m N^{1/2} \tag{6.11}$$

$$N_c = m N^{2/3} \tag{6.12}$$

in two and three dimensions respectively, where the constant m depends on the geometry
of the decomposition. For example, if the decomposition of a problem is (P × 1),
then m = 2 because each subregion communicates with its left and right neighbors
only. The following table gives m for a few decompositions which are used in the
performance measurements of section 6.7,


Decomposition    P × 1    2 × 2    3 × 3    4 × 4    5 × 4
m                  2        2        3        4        4

If we introduce the above formulas for $N_c$ and m into equation 6.8, we obtain the
following expressions for the parallel efficiency of a local-interaction problem in two
and three dimensions respectively,

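$$f = \left(1 + N^{-1/2} \frac{m U_{calc}}{U_{com}}\right)^{-1} \tag{6.13}$$

$$f = \left(1 + N^{-1/3} \frac{m U_{calc}}{U_{com}}\right)^{-1} \tag{6.14}$$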

The above equations show that if N is sufficiently large compared to the term
$m U_{calc}/U_{com}$, then high parallel efficiency can be achieved.
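As a rough numerical illustration, equations 6.13 and 6.14 can be evaluated directly. The sketch below is not part of the original system; the function name and the sample values of N and m are hypothetical, and the speed ratio 2/3 is the value quoted later in this section for the plotted curves.

```python
def parallel_efficiency(n_nodes, m, speed_ratio, dims=2):
    """Efficiency f from equation 6.13 (dims=2) or equation 6.14 (dims=3).

    n_nodes     -- N, the number of nodes in one subregion
    m           -- geometry constant of the decomposition
    speed_ratio -- U_calc / U_com
    """
    if dims not in (2, 3):
        raise ValueError("dims must be 2 or 3")
    # T_com / T_calc = N**(-1/dims) * m * U_calc / U_com
    overhead = n_nodes ** (-1.0 / dims) * m * speed_ratio
    return 1.0 / (1.0 + overhead)


if __name__ == "__main__":
    # Hypothetical sample values: m = 4 and subregions of 50^2 to 300^2 nodes.
    for side in (50, 100, 300):
        f = parallel_efficiency(side ** 2, m=4, speed_ratio=2 / 3, dims=2)
        print(f"2D, N = {side}^2: f = {f:.3f}")
```

Because the overhead term scales as $N^{-1/2}$ in 2D and $N^{-1/3}$ in 3D, halving it requires a subregion four times larger in 2D but eight times larger in 3D.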

A few comments are in order. First, we must remember that in practice we cannot
arbitrarily increase the size of the subregion per processor in order to achieve
high efficiency. This is because the computation may take too long to complete,
and because the memory of each workstation is limited. In the present system, each
workstation has a maximum memory of 32 megabytes, and a large part of this memory
is taken by other programs and other users. A practical upper limit of how much
memory can be used per workstation is 15 megabytes, which corresponds to $300^2$ fluid
nodes in 2D simulations and $40^3$ fluid nodes in 3D simulations.

In 2D simulations, the upper limit of $300^2$ fluid nodes per subregion is large enough
to achieve high efficiency. As we saw in figure 6-5, high efficiency is achieved when
the subregion per processor is larger than $100^2$ fluid nodes. By contrast, in 3D
simulations the upper limit of $40^3$ fluid nodes per subregion is too small to achieve
high efficiency. Further, the efficiency depends on the size of the subregion as
$N^{-1/3}$ in 3D versus $N^{-1/2}$ in 2D, as can be seen from equations 6.13 and 6.14. This means
that the size of the subregion N must increase much faster in 3D than in 2D to achieve
similar improvements in efficiency. Because of this fact, achieving high efficiency in
3D simulations is much more difficult than in 2D simulations.

Figure 6-12: Theoretical model of parallel efficiency for two-dimensional subregions
of size N (axis label: parallel grain size).

Having described the basics of the model of parallel efficiency, we now discuss a
small improvement of the model. We observe that in the case of a shared-bus network
the communication time $T_{com}$ must depend on the number of processors that are using
the network. In particular, if we assume that all the processors access the shared-bus
network at the same time, then the communication time $T_{com}$ must increase linearly
with the number of processors. Based on this assumption, we rewrite equation 6.10
for $T_{com}$ as follows,


$$T_{com} = \frac{m N^{1/2} (P - 1)}{U_{com}} \tag{6.15}$$


for the case of two-dimensional problems. The constant $U_{com}$ is the speed of
communication when there are only two processors sharing the network. Using the new
expression for $T_{com}$, the equation of parallel efficiency in two dimensions becomes as
follows,

$$f = \left(1 + N^{-1/2} (P - 1) \frac{m U_{calc}}{U_{com}}\right)^{-1} \tag{6.16}$$


Below,  this  model  is  tested  by  comparing  the  efficiency  which  is  predicted  by  the 
model  against  the  experimentally  measured  efficiency  of  section  6.7. 

Figure 6-12 plots the efficiency f versus the parallel grain size according to formula 6.16,
using $U_{calc}/U_{com} = 2/3$. The four curves marked with triangle, cross, square, and circle
correspond to different numbers of processors P = 4, 9, 16, 20 and also different values
of m = 2, 3, 4, 4, which depend on the geometry of the decomposition as explained
earlier. A comparison between the predicted efficiency shown in figure 6-12 and the
experimentally measured efficiency shown in figure 6-5 reveals good agreement when
the subregion per processor is large, N > $100^2$. However, for small subregions,
N < $100^2$, the predicted efficiency is too high compared to the experimental
efficiency. The reason for this is that messages in a local-area network have a large
overhead which becomes important when the messages are small, namely, when the
subregion per processor is smaller than $100^2$ fluid nodes. The overhead of small
messages leads to a smaller communication speed $U_{com}$, and a corresponding decrease
of efficiency f. We have not attempted to model the overhead of small messages here.

Figure 6-13: Theoretical model of parallel efficiency which assumes that the
communication time increases linearly with the number of processors.

Another way of examining the validity of equation 6.16 is to plot the
efficiency f versus the number of processors P while keeping all other parameters
constant. In figure 6-13, the efficiency of 2D simulations is plotted according to
equation 6.16 using N = $125^2$. We set $U_{calc}/U_{com} = 2/3$ as we did in figure 6-12, and
we set m = 2 because each subregion communicates with its left and right neighbors
only. For comparison purposes, the efficiency of 3D simulations is also plotted, using
N = $25^3$ and m = 2. The computational speed is half as large in 3D as in 2D,
and the communication of each fluid node in 3D requires 5/3 as much data as in
2D. Taking these numbers into account, we can write the following expression for the
parallel efficiency of 3D simulations,

$$f = \left(1 + N^{-1/3} (P - 1) \frac{5}{6} \frac{m U_{calc}}{U_{com}}\right)^{-1} \tag{6.17}$$

where the factor 5/6 arises because the 2D values of $U_{calc}$ and $U_{com}$ are used, which
give $U_{calc}/U_{com} = 2/3$.

If we compare the predicted efficiency shown in figure 6-13 against the experimentally
measured efficiency shown in figure 6-9, we can see that there is good agreement.
Also, the overhead of small messages, mentioned earlier, does not affect the predicted
efficiency in this case because the subregion per processor is large, N = $125^2$ in
2D, and $25^3$ in 3D. Overall, there is reasonable agreement between the theoretical
model and the experimental measurements of parallel efficiency. The model can be
improved further, if desired, by employing more sophisticated expressions for the
communication time $T_{com}$ in equation 6.15 which describes the behavior of the shared-bus
Ethernet network.
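The shared-bus model can be evaluated numerically with the parameters quoted above (N = $125^2$ in 2D, N = $25^3$ in 3D, m = 2, and the 2D speed ratio 2/3). The following is a minimal sketch of equations 6.16 and 6.17; the function name is hypothetical and the code is not taken from the original system.

```python
def shared_bus_efficiency(n_nodes, procs, m, speed_ratio, dims=2):
    """Efficiency when the communication time grows as (P - 1).

    dims=2 evaluates equation 6.16. dims=3 evaluates equation 6.17, which
    includes the factor 5/6 because speed_ratio is the 2D value U_calc/U_com.
    """
    if dims == 2:
        overhead = n_nodes ** (-1.0 / 2.0) * (procs - 1) * m * speed_ratio
    elif dims == 3:
        overhead = n_nodes ** (-1.0 / 3.0) * (procs - 1) * (5.0 / 6.0) * m * speed_ratio
    else:
        raise ValueError("dims must be 2 or 3")
    return 1.0 / (1.0 + overhead)


if __name__ == "__main__":
    for procs in (2, 5, 10, 20):
        f2 = shared_bus_efficiency(125 ** 2, procs, m=2, speed_ratio=2 / 3, dims=2)
        f3 = shared_bus_efficiency(25 ** 3, procs, m=2, speed_ratio=2 / 3, dims=3)
        print(f"P = {procs:2d}: f(2D) = {f2:.2f}, f(3D) = {f3:.2f}")
```

With P = 20 the 2D expression evaluates to roughly 0.83, which is in line with the figure of about 80% efficiency on 20 workstations quoted in the conclusion.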


6.9  Conclusion 

An effective approach to simulating fluid dynamics on a cluster of non-dedicated
workstations has been presented. The approach is particularly good for simulating
subsonic flows which involve both hydrodynamics and acoustic waves. A parallel
simulation system has been developed and applied to solve a real problem, the direct
simulation of flue pipes of wind musical instruments.

The system achieves concurrency by decomposing the flow problem into subregions,
and by assigning the subregions to parallel processes. The use of explicit
numerical methods leads to minimum communication requirements. The parallel
processes automatically migrate from busy hosts to free hosts in order to exploit
the unused cycles of non-dedicated workstations, and to avoid disturbing the regular
users. Typical simulations achieve 80% parallel efficiency (speedup/processors) using
20 HP-Apollo workstations.

Detailed measurements of the parallel efficiency of 2D and 3D simulations have
been presented, and a theoretical model of efficiency has been developed which fits
the measurements closely. The measurements show that a shared-bus Ethernet
network with 10 Mbps peak bandwidth (megabits per second) is sufficient for
two-dimensional simulations of subsonic flow, but is limited for three-dimensional
simulations. It is expected that the use of new technologies in the near future, such as
Ethernet switches, FDDI and ATM networks, will make three-dimensional
simulations of subsonic flow practical on a cluster of workstations.


6.10  Appendix 

The  appendix  describes  certain  aspects  of  the  distributed  system  that  are  not  vital 
for  a  general  reading,  but  are  useful  to  someone  who  is  interested  in  implementing  a 
distributed  system  similar  to  the  present  one. 

6.10.1  Synchronization  issues 

The synchronization between distributed processes (see section 6.4.2) can be violated
in situations such as the following. Let us suppose that process A stops execution
after communicating its data for integration step N. The nearest neighbor B can
integrate up to step N + 1 and then stop. Process B cannot integrate any further
without receiving data for integration step N + 1 from process A. However, the next
to nearest neighbor can integrate up to step N + 2, and so on. If we consider a
two-dimensional decomposition (J × K) of a problem, the largest difference in integration
step between two processes is $\Delta N$,


$$\Delta N = \max(J, K) - 1 \tag{6.18}$$

assuming that neighbors depend on each other along the diagonal direction (this
corresponds to a full stencil of local interactions as shown in figure 6-4). If neighbors
depend on each other along the horizontal and vertical directions only (this is the
star stencil of figure 6-4), then the largest difference in integration step between two
processes becomes,

$$\Delta N = (J - 1) + (K - 1) \tag{6.19}$$

These worst cases of un-synchronization are important during the migration of
processes because a precise global synchronization is required then, as explained in
section 6.5.
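The two worst-case bounds are simple enough to state as a small helper. This is an illustrative sketch only; the function name and the stencil labels are hypothetical.

```python
def max_step_difference(j, k, stencil="full"):
    """Largest integration-step gap between two processes in a (J x K) decomposition.

    "full" -- neighbors also depend on each other diagonally (equation 6.18)
    "star" -- horizontal and vertical dependencies only (equation 6.19)
    """
    if j < 1 or k < 1:
        raise ValueError("J and K must be at least 1")
    if stencil == "full":
        return max(j, k) - 1
    if stencil == "star":
        return (j - 1) + (k - 1)
    raise ValueError("stencil must be 'full' or 'star'")


if __name__ == "__main__":
    # The (5 x 4) decomposition is one of those listed in the table of section 6.8.
    print(max_step_difference(5, 4, "full"), max_step_difference(5, 4, "star"))
```

The star stencil allows a larger gap because a delay has to travel around corners one axis at a time, whereas diagonal dependencies let it cross both axes in a single step.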

The synchronization algorithm that is used during process migration is as follows.
First, we send a synchronization request to all the processes by means of a UNIX
interrupt. In response to the request, every process writes the current integration
time step into a shared file (using file locking semaphores, and append mode). Then,
every process examines the shared file to find the largest integration time step $T_{max}$
among all the processes. Further, every process chooses $T_{max} + 1$ to be the upcoming
synchronization time step, and continues running until it reaches this time step. It
is important that all the processes can reach the synchronization time step, and that
no process continues past the synchronization time step.

The above algorithm finds the smallest synchronization time step that is possible
at any given time, so that a pending migration can take place as soon as possible.
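The shared-file step of this algorithm can be sketched as follows. This is a hedged illustration, not the original implementation: the function names are hypothetical, `fcntl.flock` stands in for the file locking mentioned above, and the sketch assumes the number of processes is known so that a reader can tell when every process has reported. How the original system waits for all reports is not described here.

```python
import fcntl
import os
import tempfile


def report_step(path, step):
    """Append this process's current integration step to the shared file."""
    with open(path, "a") as shared:
        fcntl.flock(shared, fcntl.LOCK_EX)  # exclusive lock while appending
        try:
            shared.write(f"{step}\n")
            shared.flush()
        finally:
            fcntl.flock(shared, fcntl.LOCK_UN)


def choose_sync_step(path, n_procs):
    """Return T_max + 1, or None if fewer than n_procs processes have reported."""
    with open(path) as shared:
        fcntl.flock(shared, fcntl.LOCK_SH)  # shared lock while reading
        try:
            steps = [int(line) for line in shared if line.strip()]
        finally:
            fcntl.flock(shared, fcntl.LOCK_UN)
    if len(steps) < n_procs:
        return None
    return max(steps) + 1


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        for step in (41, 43, 42):  # three processes at different steps
            report_step(path, step)
        print(choose_sync_step(path, n_procs=3))
    finally:
        os.remove(path)
```

Every process that reads the complete file computes the same value, so no further message exchange is needed to agree on the synchronization step.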

6.10.2  Alternative  communication  mechanisms 

A minor efficiency issue with regard to TCP/IP communication (see section 6.4.2)
is the order in which the neighboring processes communicate with each other. One
way is for each parallel process to communicate with its neighbors on a
first-come-first-served basis. An alternative way is to impose a strict ordering on the way the
processes communicate with each other. For example, we consider a one-dimensional
decomposition (J × 1) of a problem with non-periodic outer boundaries where each
process receives data from its left neighbor before it can send data to its right neighbor.
Then, the leftmost process No. 1 will access the network first, and the nearest-neighbor
process No. 2 will access the network second, and so on. The intent of such ordering
is to pipeline the messages through the shared-bus network in a strict fashion in an
attempt to improve performance. However, it does not work very well if one process is
delayed, because then all the other processes are delayed also. Small delays are inevitable in
time-sharing UNIX systems, and strict ordering amplifies them to global delays. By
contrast, asynchronous first-come-first-served communication allows the computation
to proceed in those processes that are not delayed, and better performance is achieved
overall. In the parallel system, first-come-first-served communication is implemented
using the “select” system call of sockets (see UNIX manual).
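First-come-first-served reception with "select" can be sketched in a few lines. This is an illustration using Python's standard `select` and `socket` modules, not the code of the original system. The function name is hypothetical, and the sketch assumes that each neighbor sends exactly one message per step and that the message fits in a single `recv` call.

```python
import select
import socket
import threading


def receive_first_come_first_served(neighbors, nbytes=4096):
    """Read one message from every neighbor socket, serving whoever is ready first.

    neighbors -- dict mapping a neighbor name to a connected socket
    Returns a list of (name, data) pairs in the order they were served.
    """
    pending = {sock: name for name, sock in neighbors.items()}
    arrivals = []
    while pending:
        # Block until at least one neighbor has data ready to read.
        readable, _, _ = select.select(list(pending), [], [])
        for sock in readable:
            data = sock.recv(nbytes)
            arrivals.append((pending.pop(sock), data))
    return arrivals


if __name__ == "__main__":
    left_local, left_remote = socket.socketpair()
    right_local, right_remote = socket.socketpair()
    # The right neighbor is ready immediately; the left neighbor is delayed.
    right_remote.sendall(b"boundary from right")
    delayed = threading.Timer(0.2, left_remote.sendall, args=(b"boundary from left",))
    delayed.start()
    print(receive_first_come_first_served({"left": left_local, "right": right_local}))
    delayed.join()
    for sock in (left_local, left_remote, right_local, right_remote):
        sock.close()
```

A strictly ordered receiver would block on the left socket and leave the data from the right neighbor waiting, which is the delay amplification described above.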

Regarding the choice of communication protocol, the TCP/IP protocol is used
because it is very simple as explained in section 6.4.2. Apart from the TCP/IP
protocol, another protocol that is popular in distributed systems is the UDP/IP
protocol, also known as datagrams. The UDP/IP protocol is similar to TCP/IP
with one major difference: there is no guaranteed delivery of messages. Thus, the
distributed program must check that messages are delivered, and resend messages if
necessary, which is a considerable effort. However, the benefit is that the distributed
program has more control of the communication. For example, a distributed program
could take advantage of knowing the special properties of its own communication
to achieve better results than the TCP/IP standard. Another advantage is
robustness in the case of network errors that occur under very high network traffic.
For example, when TCP/IP fails, it is hard to know which messages need to be resent.
In UDP/IP the distributed program controls precisely which data is sent and when,
so that the failure problem is handled directly. Despite these advantages of UDP/IP
over TCP/IP, I have chosen to work with TCP/IP because of its simplicity.

6.10.3 Performance bugs to avoid

In section 6.7, we examined the performance of the HP-Apollo workstations. It should
be noted that the performance of the HP9000/700 Apollo workstations can degrade
dramatically at certain grid sizes, by a factor of two or more, but there is an easy way
to fix the problem. The loss of performance occurs when the length of the arrays in
the program is a near multiple of 4096 bytes, which is also the virtual-memory page
size. This suggests that the loss of performance is related to the prefetching algorithm
of the CPU cache of the HP9000/700 computers. To avoid the loss of performance,
the arrays can be lengthened by 200-300 bytes when their length is a near multiple
of 4096. This modification eliminates the loss of performance.
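The padding rule can be expressed as a small helper. The sketch below is illustrative: the text does not say how close to a multiple of 4096 counts as "near", so the 64-byte tolerance is a hypothetical choice, and the 256-byte pad is simply one value inside the 200-300 byte range given above.

```python
PAGE_BYTES = 4096  # page size quoted in the text
NEAR_BYTES = 64    # hypothetical tolerance for "near multiple"
PAD_BYTES = 256    # one value within the 200-300 byte range


def padded_length(length_bytes):
    """Return the array length in bytes, lengthened if it is a near multiple of 4096."""
    if length_bytes < PAGE_BYTES - NEAR_BYTES:
        return length_bytes  # too short to be near any non-zero multiple
    remainder = length_bytes % PAGE_BYTES
    distance = min(remainder, PAGE_BYTES - remainder)
    if distance <= NEAR_BYTES:
        return length_bytes + PAD_BYTES
    return length_bytes


if __name__ == "__main__":
    for length in (6000, 8192, 8192 - 64):
        print(length, "->", padded_length(length))
```

With these constants a padded length is never itself a near multiple, so applying the function twice gives the same result as applying it once.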

Another problem that can lead to loss of performance is the handling of floating-point
exceptions. When an underflow exception occurs, the HP9000/700 workstations
trap into the system kernel by default, and this causes considerable slow-down. The
slow-down is amplified in a distributed computation because if one processor slows
down, all the processors slow down. A particular situation in fluid dynamics occurs
when the passage of an acoustic wave causes underflow exceptions in different
processors at different times. Then, during the passage of the acoustic disturbance, all the
processors are delayed. Such problems can be observed at the beginning of the
simulation when the fluid begins to move from an initial non-moving state (namely, the
density variations of the fluid are exactly equal to zero at startup). Fortunately, there
is a simple solution, which is to avoid initializing the fluid density variations to
exact zero; for example, an initial density gradient with relative size $10^{-10}$ is
practically the same as zero in the present situation. Such a non-zero initialization avoids
the floating-point underflow. Another solution, which is available in the HP9000/715
workstation models but not in the 720 models, is to set “fast underflow mode” using
the system call “fpsetfastmode” of HPUX. Fastmode causes the hardware to simply
substitute a zero for the result of an operation that underflows, without a system
fault and without any delay.
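One possible way to see why a small non-zero background helps is sketched below. It assumes that the exceptions arise from subnormal (denormalized) values at the faint leading edge of a disturbance; the text does not spell out the mechanism, so this is an interpretation. The helper name and the amplitude 1e-320 are hypothetical.

```python
import sys


def is_subnormal(x):
    """True if x is non-zero but smaller in magnitude than the smallest normal double."""
    return x != 0.0 and abs(x) < sys.float_info.min


if __name__ == "__main__":
    faint_edge = 1e-320  # hypothetical amplitude at the front of an arriving wave
    zero_background = 0.0
    small_background = 1e-10  # relative size suggested in the text

    # On an exactly-zero background the faint value is stored as a subnormal.
    print(is_subnormal(zero_background + faint_edge))
    # On the small non-zero background the sum stays a normal number.
    print(is_subnormal(small_background + faint_edge))
```

Under this reading, the non-zero background keeps every stored value inside the normal floating-point range until the disturbance has grown well beyond the underflow threshold.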
Figure 6-14: Communication of data across boundaries (dashed lines) for the finite
difference method.

6.10.4  Communication  of  fluid  flow  boundaries 

In section 6.6, the parallelization of explicit numerical methods for fluid dynamics
was discussed. Here, the precise manner in which the fluid flow boundaries are
communicated between the parallel processes is described. The finite difference method
communicates the fluid variables $(\rho, V_x, V_y)$ in 2D, and $(\rho, V_x, V_y, V_z)$ in 3D. The
lattice Boltzmann method communicates the moving populations $F_i$ that must be shifted
across a boundary. There are 3 moving populations $F_i$ in each direction in 2D, and 5
moving populations $F_i$ in each direction in 3D.

Figures 6-14 and 6-15 show how the boundary values are communicated along the
x and y directions. In the case of the finite difference method (figure 6-14), the values
on the inner nodes next to the padded area of region A are copied onto the padded area
of region B. In the case of the lattice Boltzmann method (figure 6-15) the values on the
padded area of region A are copied onto the inner nodes of region B. The differences
in data movement are due to the fact that the finite difference method communicates
the fluid variables $\rho, V_x, V_y$, while the lattice Boltzmann method communicates the
moving populations $F_i$. The moving populations $F_i$ are shifted to the padded areas
prior to the communication operations.

Figure 6-15: Communication of data across boundaries (dashed lines) for the lattice
Boltzmann method.

The corner nodes of each rectangular region need special attention because they
connect regions diagonally (for example, regions A and C in figure 6-14). A simple way
of handling diagonal connections is to communicate along the x-direction first, then
along the y-direction, then along the z-direction. Thus, the diagonal corner values
are updated correctly at the expense of constraining the order of communication.
The lattice Boltzmann method obeys this constraint. The finite difference method,
however, does not obey this constraint, and it ignores the corner points. This is a
special case because in the present simulations the differencing stencils are
cross-shaped without diagonals. This is exploited so that the communication operations of
the finite difference method take place in any order between the x, y, z directions, and
there are no diagonal dependencies.
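The way a staged exchange carries corner values between diagonal neighbors can be demonstrated on a toy 2 × 2 decomposition. The sketch below is an illustration under stated assumptions, not the original code: it uses the finite-difference style of copy (inner nodes onto the neighbor's padded area), a padding one node wide, and a y pass that sends whole rows including the padded columns already filled by the x pass. The region layout and all function names are hypothetical.

```python
def make_region(n, label):
    """(n+2) x (n+2) grid: interior nodes hold `label`, padded nodes hold None."""
    grid = [[None] * (n + 2) for _ in range(n + 2)]
    for row in range(1, n + 1):
        for col in range(1, n + 1):
            grid[row][col] = label
    return grid


def exchange_x(regions, n):
    """Copy inner boundary columns onto the padded columns of x-neighbors."""
    for (cx, cy), left in regions.items():
        right = regions.get((cx + 1, cy))
        if right is None:
            continue
        for row in range(n + 2):
            right[row][0] = left[row][n]      # left's last inner column
            left[row][n + 1] = right[row][1]  # right's first inner column


def exchange_y(regions, n):
    """Copy inner boundary rows, padded columns included, onto y-neighbors."""
    for (cx, cy), lower in regions.items():
        upper = regions.get((cx, cy + 1))
        if upper is None:
            continue
        for col in range(n + 2):
            upper[0][col] = lower[n][col]
            lower[n + 1][col] = upper[1][col]


if __name__ == "__main__":
    n = 3
    # A and C are diagonal neighbors, as in the example in the text.
    regions = {(0, 0): make_region(n, "A"), (1, 0): make_region(n, "B"),
               (0, 1): make_region(n, "D"), (1, 1): make_region(n, "C")}
    exchange_x(regions, n)
    exchange_y(regions, n)
    print(regions[(0, 0)][n + 1][n + 1])  # padded corner of A that faces C
```

No message is ever sent directly between A and C; the corner value of C reaches A in two hops, which is why the second pass has to wait for the first to finish.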


Chapter  7 

Music  by  flue  pipes 


First, I review flow-generated sound phenomena in general, and then I focus on
simulations of flue pipes. The results presented here are a continuation of the computer
simulations and physical measurements already described in chapter 1.

7.1  Background 

7.1.1  Related  computational  work 

Related work on simulating flow-generated sound phenomena has been limited, and
all of the previous studies have employed incompressible flow equations as far as
I know. For example, Ohring [35] has simulated jets of air impinging on a sharp
triangular wedge using an incompressible flow calculation. Peters [37] has employed
vortex methods to simulate the initial stages of blowing air through a flue channel
and also the flow of gas through industrial pipe systems. Harding [24] has used
an incompressible flow calculation as a source term to a wave equation in order to
study the sound generated by an obstruction inside a channel. As explained earlier in
chapter 1, Harding’s approach applies only when the acoustic waves do not interact
with the hydrodynamic flow. In the case of flue pipes, acoustics and hydrodynamics
must be simulated together using the compressible Navier-Stokes equations.

7.1.2 Catalogue of flow-generated sound phenomena

There  is  a  very  wide  variety  of  sound  phenomena  which  are  triggered  and  sustained 
by  the  flow  of  air  (or  any  fluid  medium)  and  the  interaction  between  the  flow  and 
solid  obstacles.  The  following  are  some  well-known  examples. 

•  Flue pipes exploit the oscillations of narrow jets of air that impinge on a sharp
obstacle called the labium. The operation of flue pipes depends on the coupling
between acoustic and hydrodynamic oscillations. Flue-based musical instruments
include the baroque recorders, the flutes, the organ pipes found in
cathedrals, the pipes used by Latin American cultures, the pan-pipes of ancient
Greece, and the bamboo flutes found in many Pharaohs’ tombs inside Egyptian
pyramids.

•  A Helmholtz resonator (a glass bottle) can be used in the place of a long pipe.
Blowing a narrow jet of air over the opening of a bottle generates pure tones of
a definite frequency.

•  The sound generated by swinging around a plastic tube with a diameter of 1-5
cm (children often do this) is probably similar to blowing air over a pipe or a
bottle. An observer that stays with the moving tube sees the air rushing over
the opening of the tube. The layer of air (boundary layer) next to the opening
of the tube is very unstable, and can easily start oscillating near the resonant
acoustic frequencies of the pipe.

•  Whistles  are  close  relatives  to  flue  pipes.  For  instance,  in  human  whistling, 
the  teeth  and  the  tongue  are  used  to  form  a  narrow  jet  of  air.  The  jet  of  air 
is  blown  against  an  obstruction  of  appropriate  shape  (the  lips).  The  mouth 
probably  acts  as  a  resonator  in  this  case. 

•  Another type of whistling (lower frequency than lip-whistling) is possible by
putting one’s hands together to form a cavity with a narrow opening between
the two thumbs, and by blowing a narrow jet of air (using one’s lips and teeth)
tangentially onto the opening between the two thumbs. The thumb-nails should
be positioned below one’s nose in order to blow air tangentially onto the opening
between the thumbs. If this has not been done before, it takes some
experimentation at first to get it right.

•  Besides flue pipes and whistles, another flow-generated sound phenomenon is
the Aeolian tone. An Aeolian tone is generated when a stream of air flows
around a narrow obstacle, such as a wire or a cylinder. The stream of air may
be wide, or it may be very narrow so that it can be viewed as a jet of air.
Morse and Ingard [33, p. 751] provide experimental formulas for the frequencies of
Aeolian tones as a function of air speed and wire diameter. A related musical
instrument is the Aeolian harp which consists of a set of strings that vibrate
when the air is blown against them.

•  In sound phenomena such as Aeolian tones, the acoustic and the hydrodynamic
oscillations of the air typically trigger vibrations of the wire so that there is a
coupling between acoustic, hydrodynamic, and solid-obstacle oscillations. This
coupling amplifies the resonant frequencies of the wire, and sometimes it even
leads to disaster when there is not sufficient damping of the solid-object
vibrations. The collapse of the Tacoma bridge and the collapse of industrial chimneys
(Tritton [54, p. 444]) are famous examples.

•  Reed  musical  instruments  such  as  the  clarinet  and  the  harmonica  also  exploit 
the  vibrations  of  solid  obstacles.  It  should  be  noted  however  that  a  vibrating 
reed  is  somewhat  different  from  a  vibrating  wire  because  the  reed  vibrations 
open  and  close  periodically  a  narrow  opening  through  which  the  air  passes. 

The above catalogue describes some representative examples of flow-generated
sound phenomena. Many other variations of the above are certainly
possible. Below, the operation of flue pipes is considered further.

7.2 The operation of flue pipes

The operation of flue pipes has been studied for hundreds of years. Considerable
progress has been made, but important basic questions remain unanswered. For
example, a most basic question is whether a given geometry (flue channel, labium,
and pipe) will produce audible tones. Anyone who has experimented with building
new kinds of flue pipes knows very well that small changes in the geometry can make
a flue pipe sing, make no sound at all, or make a very mediocre sound (noisy, hissing,
or including intermittent vibrations and beats). Presently, the existing theories of
flue pipes cannot answer questions such as whether a given flue pipe will sing or not.

The existing theories of flue pipes try to reduce the complexities of the fluid
dynamics inside a flue pipe to a system of lumped components such as oscillators
(inductors, capacitors), dampers, and amplifiers. Such a reduction introduces a number
of parameters which are adjusted to fit the observed results of a particular flue pipe
(Verge94 [57, 56], Hirschberg [26]). Considerable success has been reported with some
reduced models of flue pipes, but the subject still has a long way to go. For example,
the assumptions of the reduced models are not agreed upon by everyone, and they
are not completely understood. Furthermore, finding reduced models of flue pipes is
somewhat of an art. It is not clear what approximations can be made when a new
flue pipe of different geometry is considered.

The details of existing theories of flue pipes will not be discussed here. However,
there are a few basic principles that are worth reviewing. First, it is assumed that
there is some kind of feedback between the acoustic waves in the pipe and the jet.
This feedback is responsible for amplifying the acoustic waves under appropriate
conditions. Second, it is recognized that there are at least two major types of
feedback: hydrodynamic and acoustic. The hydrodynamic feedback refers to the
interaction between the jet of air and the labium, and includes the shedding of vortices
by the jet, and the local pressure gradients which have an immediate effect on the
jet. The acoustic feedback refers to the pressure disturbances (traveling waves) which
emanate from the jet-labium region, travel down the pipe, reflect, and return back
to the jet-labium region after a considerable delay. The major distinction between
acoustic and hydrodynamic feedback is the time delay of traveling waves versus the
almost-zero delay of hydrodynamic effects.

The distinction between the hydrodynamic and the acoustic feedback is closely
related to the distinction between an “edge tone” and a “pipe tone”. The former refers
to the oscillations of a jet impinging on a sharp obstacle without any resonant cavity in
the vicinity. The latter, the pipe tone, refers to the normal operation of a flue pipe,
where sound is generated by a jet of air impinging on a sharp edge near a resonant pipe.
It is clear that in the edge tone there is no reflection of acoustic waves (no delayed
feedback) which means that the edge tone is a purely hydrodynamic phenomenon by
definition. Furthermore, the frequencies of an edge tone are approximately
proportional to the blowing speed, and inversely proportional to the distance between the
jet’s orifice and the obstacle. An experimental formula is as follows (Hirschberg [26,
p. 210]),

$$\frac{f W}{V} = 0.4 (n + \gamma), \qquad n = 1, 2, \ldots \tag{7.1}$$

where $f$ is the frequency in Hz, $V$ is the mean speed of the jet in cm/s, $W$ is the
distance between the jet’s orifice and the obstacle in cm, and $\gamma$ is a small
correction, $0 < \gamma < 0.5$. By contrast, the frequencies of a pipe tone do not vary much
with the blowing speed (except for jumping to higher modes), and are determined
mostly by the acoustic feedback and the dimensions of the resonant pipe. As the
blowing speed increases, the pipe-tone frequencies stay approximately fixed until at
some point the frequencies “jump” to higher values which are near higher resonant
modes of the pipe. This is, of course, a simplified picture. In practice, low-frequency
beats, hissing sound, and failure to sing may also occur as the blowing conditions are
varied.
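
The proportionality of edge-tone frequency to blowing speed can be illustrated with a short calculation. The sketch below simply evaluates the experimental formula 7.1; the function name and the input values (a jet speed of 818 cm/s and an orifice-to-obstacle distance of 0.4 cm, which are the values used for the recorder later in this chapter) are chosen for illustration only.

```python
def edge_tone_frequency(v_jet, w_gap, n=1, gamma=0.0):
    """Edge-tone frequency in Hz from formula 7.1: f * W / V = 0.4 * (n + gamma).

    v_jet : mean jet speed V in cm/s
    w_gap : orifice-to-obstacle distance W in cm
    n     : mode number (1, 2, ...)
    gamma : small empirical correction, between 0 and 0.5
    """
    if n < 1:
        raise ValueError("mode number n must be 1 or larger")
    if not 0.0 <= gamma <= 0.5:
        raise ValueError("gamma is expected to lie between 0 and 0.5")
    return 0.4 * (n + gamma) * v_jet / w_gap


# Illustrative inputs (assumed here, taken from the recorder discussed later).
v_jet = 818.0   # cm/s
w_gap = 0.4     # cm

# Bracket the first mode by the two extreme values of the correction gamma.
f_low = edge_tone_frequency(v_jet, w_gap, n=1, gamma=0.0)
f_high = edge_tone_frequency(v_jet, w_gap, n=1, gamma=0.5)
print(f"first edge-tone mode between {f_low:.0f} Hz and {f_high:.0f} Hz")
```

Doubling `v_jet` doubles both bounds, which is the behavior that distinguishes an edge tone from a pipe tone.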

In  comparing  edge  tones  and  pipe  tones,  it  should  be  noted  that  an  edge  tone  often 
does  not  generate  enough  acoustic  energy  to  be  audible.  Generally,  an  edge  tone  is 
weaker than a pipe tone because there is no resonant cavity to amplify the sound. A
related issue is that when a flue pipe stops singing, the jet of air often continues to
oscillate. Perhaps this is a type of edge tone where the acoustic coupling between
the jet and the resonant cavity fails for some reason, and the hydrodynamic effects
play a dominant role in the jet’s oscillations, but do not generate enough acoustic
energy to be audible (figure 7-9 shows a simulation where this phenomenon may be
occurring).

The details of the hydrodynamic and the acoustic feedback are still a subject of
research. A currently popular model of the acoustic feedback is to assume that the
jet behaves as if it were infinitely long, and that the acoustic waves inside the pipe
perturb the jet as it emerges from the flue channel. As the jet undulates, it amplifies
the perturbations, and returns acoustic energy into the pipe. This model is based on
the work of Rayleigh (J.W. Strutt) on infinitely long jets. Although there is some
truth to this model, the actual jet inside a flue pipe is anything but infinite. The jet
is short and rather unpredictable. Sometimes the jet extends, undulating, all the way
from the orifice to the labium, and other times the jet breaks well before reaching
the tip of the labium. Perhaps different reduced models of the jet are needed to
characterize different behaviors.

Some factors which control the operation of a flue pipe, what frequencies are
generated, and how well the flue pipe sings are listed below.

• The blowing speed of the jet.

• The initial blow of air into the pipe that triggers the oscillations.

• The orifice-to-labium distance.

• The alignment of the labium with the flue channel, and also the alignment of
the labium with the resonant pipe.

• The length of the resonant pipe, as well as the width (and depth) of the pipe.

• The conditions outside the pipe and especially above the labium. For example,
an infinite region above the labium, stagnant air, and constant ambient pressure
seem to help the operation of the flue pipe. By contrast, a limited region
above the labium, accumulation of vorticity, and buildup of pressure gradients
complicate the operation of the flue pipe.

The  last  one  of  the  above  conditions  has  already  been  mentioned  in  section  1.4  as  a 
possible  cause  for  the  differences  between  the  computer  simulations  and  the  physical 
measurements  of  the  20  cm  closed-end  soprano  recorder.  Below,  the  simulation  of 
flue  pipes  is  discussed  further. 

7.3  Inlet  and  outlet  boundary  conditions 

In  this  section,  suitable  boundary  conditions  for  modeling  the  inlet  and  the  outlet  in 
simulations  of  flue  pipes  are  described.  The  same  approach  applies  both  to  the  lattice 
Boltzmann method and the compressible finite difference method of section 3.3.

The boundary conditions at the inlet and the outlet must ensure that a prescribed
flow of air enters and exits the simulated region. Furthermore, the boundary conditions
at the inlet and the outlet must avoid the reflection of acoustic waves, if possible.
This is an important issue in modeling flue pipes because the region above the labium
should approximate as much as possible an infinite region, not a resonant cavity.

A simple technique for non-reflecting (absorbing) boundary conditions can be devised
as follows. We observe that in compressible flow, the propagation of acoustic
waves occurs by interchanging the acoustic energy between two forms, kinetic (velocity)
and potential (density). If either the velocity or the density is “clamped” down
at a point, acoustic reflection occurs at that point. If both the velocity and the density
are free to vary (as in free space), the acoustic wave propagates freely without
reflections. If both the velocity and the density are “clamped” down, the acoustic
wave is absorbed, and there are no reflections.

The above rules for the reflection of acoustic waves can be verified by considering
a few simple cases. An example where the velocity is clamped and the density is free
is a non-slip wall. As a traveling wave reaches the wall, the acoustic velocity must
vanish, which causes the density to build up at the wall, and subsequently creates
a traveling wave in the opposite direction (the reflection). If the traveling wave is a
pulse of positive density, so is the reflected wave; in other words, the phase of the
acoustic wave is preserved after a wall reflection.

By  contrast,  when  the  velocity  is  free  and  the  density  is  clamped,  the  phase  of 
the  traveling  wave  is  reversed.  An  example  is  the  reflection  at  the  end  of  an  open 
pipe;  namely,  a  pipe  which  opens  into  inhnite  space.  In  this  case,  the  density  is  held 
approximately  constant  (ambient  atmospheric  pressure),  and  the  velocity  varies.  As 
the  traveling  wave  reaches  the  opening,  the  density  pulse  (let  us  assume  a  positive 
pulse)  must  vanish,  which  causes  the  acoustic  velocity  to  increase  further  (the  po¬ 
tential  energy  becomes  kinetic)  until  eventually  a  negative  pulse  of  density  is  created 
which  travels  backwards  (the  reflection). 

The  above  rules  describe  what  happens  in  the  physical  world.  Similar  rules  to  the 
above  can  be  applied  in  a  numerical  simulation  of  compressible  flow. 
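
The two reflection rules (velocity clamped, density clamped) can be checked with a very small numerical experiment. The following is only a sketch under stated assumptions: it uses one-dimensional linear acoustics on a staggered grid with a leapfrog update, which is not the lattice Boltzmann or finite difference scheme of this thesis, and all names and parameter values are illustrative. A positive density pulse is sent towards the right end of a duct, and the sign of the returning pulse is inspected.

```python
import numpy as np


def reflected_peak(right_end, n=400, steps=800, c=1.0, dx=1.0, dt=0.5):
    """Return the signed peak of rho'/rho0 after one reflection at the right end.

    right_end = "wall": velocity clamped to zero, density free.
    right_end = "open": density clamped to ambient, velocity free.
    """
    x_cell = (np.arange(n) + 0.5) * dx        # density is stored at cell centers
    x_face = np.arange(n + 1) * dx            # velocity is stored at cell faces
    x0, sigma = 0.5 * n * dx, 10.0 * dx
    rho = np.exp(-((x_cell - x0) / sigma) ** 2)      # positive pulse of rho'/rho0
    u = c * np.exp(-((x_face - x0) / sigma) ** 2)    # u = c * rho'/rho0: right-going wave

    for _ in range(steps):
        # Interior faces: du/dt = -c^2 d(rho'/rho0)/dx
        u[1:-1] -= c**2 * dt / dx * (rho[1:] - rho[:-1])
        u[0] = 0.0                            # the left end is always a wall
        if right_end == "wall":
            u[-1] = 0.0                       # clamp the velocity
        elif right_end == "open":
            # A ghost cell with rho_ghost = -rho[-1] holds rho' = 0 on the end face.
            u[-1] += 2.0 * c**2 * dt / dx * rho[-1]
        else:
            raise ValueError("right_end must be 'wall' or 'open'")
        # Continuity: d(rho'/rho0)/dt = -du/dx
        rho -= dt / dx * (u[1:] - u[:-1])

    # After `steps` updates the pulse has traveled to the end and back to the middle.
    return rho[np.argmax(np.abs(rho))]


print("wall end:", reflected_peak("wall"))   # positive: phase preserved
print("open end:", reflected_peak("open"))   # negative: phase reversed
```

With the default parameters the pulse travels 200 cells to the right end and 200 cells back, so the sign of the largest density excursion at the end of the run is the sign of the reflected pulse.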


Figure 7-1: Soprano recorder flue, 20 cm closed-end pipe. The numbers shown correspond
to millimeters (dimension labels in the drawing: 25, 70, 75, 40, 13.4, 200). Inlet is at the
left, outlet is at the top of the picture.


For example, in the simulation of a closed-end flue pipe, both the pressure and
the velocity are prescribed at the inlet and the outlet so as to avoid the reflection
of acoustic waves. In particular, the pressure is set equal to zero at the outlet, and
equal to an estimated pressure drop at the inlet. Figure 7-1 shows a typical geometry
of such a flue pipe simulation. The inlet is located at the narrow opening of the
flue channel at the left side, and the outlet is located at the top of the picture. It
must be noted that the pressure drop is not known a priori because it depends on
the imposed flow of air, and on the dynamical behavior of the system. Thus, the
imposed pressure drop must be an approximation. Such an approach is successful
in preventing the reflection of acoustic waves,^ but raises the question whether it is
consistent to specify both the velocity and the pressure at the inlet and the outlet.

To  answer  the  above  question,  let  us  consider  the  case  of  Hagen-Poiseuille  flow 
through  a  long  pipe.  When  a  pressure  drop  is  imposed  between  the  inlet  and  the 
outlet,  a  flow  develops  through  the  pipe.  When  a  flow  is  imposed  through  the  pipe, 
a  pressure  drop  develops.  When  both  a  flow  and  a  pressure  drop  are  imposed,  a  flow 
develops  which  is  higher  than  the  imposed  flow  if  the  imposed  pressure  drop  is  an 
overestimate  of  the  pressure  drop  corresponding  to  the  imposed  flow;  and  conversely 
if the imposed pressure drop is an underestimate. This behavior is easily verified
in simulations of Hagen-Poiseuille flow and also in simulations of flue pipes using
the lattice Boltzmann and the compressible finite difference method (figures 7-1A
and 7-1B).

Table  7.1  shows  the  imposed  velocity  and  the  actual  flow  through  the  flue  channel 
in simulations of the 20 cm closed-end recorder. The profile of the imposed velocity is

^Another  issue  which  relates  to  hydrodynamics  as  opposed  to  acoustics  is  the  “reflection”  of 
vortices  reaching  the  outlet.  In  particular,  vortices  are  generated  at  the  labium  of  the  flue  pipe,  and 
eventually  reach  the  outlet  if  the  simulation  continues  long  enough.  When  this  happens,  the  vortices 
do  not  simply  cross  the  outlet  and  leave  the  simulated  region.  Instead,  the  vortices  reach  the  outlet, 
try  to  leave  the  simulated  region,  and  then  bounce  back  into  the  simulated  region.  The  accumulation 
of  vorticity  in  the  simulated  region  creates  problems  because  it  changes  the  nature  of  the  problem 
being  simulated.  This  issue  is  avoided  in  the  present  simulations  by  making  the  simulated  region 
large  enough  that  the  vortices  generated  at  the  labium  do  not  reach  the  outlet  during  the  simulation. 
Better  boundary  conditions  or  some  way  to  dissipate  the  vorticity  before  reaching  the  outlet,  must 
be devised in order to continue the simulations indefinitely.

Lattice Boltzmann     imposed V     800    1080    1500    1959
                      actual V      818    1104    1535    1995

Finite Differences    imposed V     800    1060    1558    1985
                      actual V      838    1113    1634    2082

Table 7.1: Imposed velocity and actual flow through the flue channel in flue pipe
simulations. The velocity V is in cm/s.


Figure 7-1A: Pressure drop and flow speed through a channel (horizontal axis: channel
length). Overimposed pressure boundary conditions between inlet and outlet.
Compressible finite difference method.


parabolic  both  at  the  inlet  and  the  outlet  (the  total  flux  at  the  outlet  is  set  equal  to 
the  total  flux  at  the  inlet).  The  actual  flow  through  the  flue  channel  is  measured  by 
sampling  midway  along  the  width  of  the  flue  channel  and  time-averaging.  The  velocity 
profile  inside  the  channel  is  parabolic  so  the  horizontal  velocity  at  the  midpoint  is 
scaled  by  2/3  to  calculate  the  mean  speed  shown  in  table  7.1.  We  can  see  that  the 
actual  flow  is  always  larger  than  the  imposed  flow.  This  is  because  the  imposed 
pressure  drop  is  an  overestimate  of  the  pressure  drop  corresponding  to  the  imposed 
flow through a channel 0.1 cm wide and 4 cm long. Specifically, the imposed pressure

Figure 7-1B: Pressure drop and flow speed through a channel (horizontal axis: channel
length). Overimposed pressure boundary conditions between inlet and outlet. Lattice
Boltzmann method using second-order differences at the boundary.


drop is equal to the Hagen-Poiseuille pressure drop of a channel whose length is 3.5
times the length of the flue channel; namely,

$$\frac{\Delta P}{\rho_0} = \Delta L \, \frac{12 \nu}{d^2} \, V = (3.5 \times 4.0) \times \frac{12 \nu}{0.1^2} \times V = 2500 \times V \tag{7.2}$$

where $V$ is the mean velocity, $\Delta P$ is the pressure drop in gm/(cm s$^2$), $\rho_0$ is the mean
density of air, $\nu$ is the kinematic viscosity of air, and $d$ and $\Delta L$ represent the
channel’s width and length.
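
The size of the imposed pressure drop can be checked with a few lines of arithmetic. The sketch below assumes the plane Poiseuille relation $\Delta P / \rho_0 = 12 \nu \, \Delta L \, V / d^2$ for a channel of width $d$ and length $\Delta L$, and assumes a kinematic viscosity of air of 0.15 cm²/s; that viscosity is a typical value and is not stated in this section. The function and constant names are illustrative.

```python
def poiseuille_drop_over_density(v_mean, length, width, nu):
    """Plane Poiseuille pressure drop divided by density, in cm^2/s^2 (cgs units).

    v_mean : mean flow speed in cm/s
    length : channel length in cm
    width  : channel width in cm
    nu     : kinematic viscosity in cm^2/s
    """
    return 12.0 * nu * length * v_mean / width**2


NU_AIR = 0.15         # cm^2/s, assumed typical value for air
WIDTH = 0.1           # cm, flue channel width
LENGTH = 4.0          # cm, flue channel length
OVERESTIMATE = 3.5    # factor by which the channel length was overestimated

# Coefficients multiplying V (evaluate the relation at V = 1 cm/s).
coeff_channel = poiseuille_drop_over_density(1.0, LENGTH, WIDTH, NU_AIR)
coeff_imposed = poiseuille_drop_over_density(1.0, OVERESTIMATE * LENGTH, WIDTH, NU_AIR)

print(f"flue channel alone: dP/rho0 = {coeff_channel:.0f} x V")
print(f"imposed (3.5x):     dP/rho0 = {coeff_imposed:.0f} x V")
```

Under these assumptions the imposed coefficient comes out near 2520, which is consistent with the rounded value of 2500 quoted in equation 7.2.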

The  above  pressure  drop  is  much  larger  than  necessary.  The  actual  pressure  drop 
between  the  inlet  and  the  outlet  of  the  simulations  of  flue  pipes  is  dominated  by 
the  pressure  drop  along  the  narrow  flue  channel.  Thus,  it  would  suffice  to  impose  a 
pressure  drop  equal  to  the  Hagen-Poiseuille  pressure  drop  of  the  flue  channel.  Due 
to  an  oversight  (see  footnote  of  page  225B),  the  pressure  drop  was  imposed  3.5  times 
larger  than  necessary.  However,  an  overestimated  pressure  drop  does  not  cause  any 
serious  problems  in  the  simulations;  it  only  produces  a  slightly  larger  flow  than  the 
imposed flow as shown in table 7.1 and figures 7-1A and 7-1B. In general, only an
order-of-magnitude estimate of the pressure drop is needed.

Figure 7-1C: Pressure drop and flow speed through a channel. Lattice Boltzmann
method using first-order differences at the boundary.


Figures 7-1A to 7-1C show the pressure drop and flow speed during steady state
in simulations of a channel which is 0.1 cm wide and 4 cm long. Both the density and
velocity are imposed at the inlet and outlet, and the walls are non-slip. The setup
is similar to the simulations of flue pipes except that only the channel is considered
here for simplicity. The grid is 401 × 11. The flow speed in the figures is expressed
in cm/s and the pressure drop in $(p - p_0)/(\rho_0 c_s^2)$, where $\rho_0$ is the mean density of air,
and $c_s$ is the speed of sound. Both the pressure and the flow speed are sampled at
the midpoint and along the length of the flue channel (the speed at the midpoint is
scaled by 2/3 to calculate the mean speed because the velocity profile is parabolic).
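
The factor 2/3 follows from averaging a parabolic profile across the channel width. A minimal numerical check, assuming an ideal profile that vanishes at both walls, is sketched below; the function name and the sample centerline speed are hypothetical.

```python
import numpy as np


def mean_to_centerline_ratio(n=1000):
    """Ratio of mean speed to centerline speed for a parabolic profile.

    The profile is u(s) = 1 - s**2 with s in (-1, 1) measured from the channel
    midpoint in units of the half-width, so u = 1 at the midpoint and u = 0 at
    the walls. The mean is approximated with n midpoint samples.
    """
    s = (np.arange(n) + 0.5) / n * 2.0 - 1.0
    return float(np.mean(1.0 - s**2))


ratio = mean_to_centerline_ratio()
centerline_speed = 1500.0                 # cm/s, hypothetical sample value
mean_speed = ratio * centerline_speed
print(f"ratio = {ratio:.4f}, mean speed = {mean_speed:.0f} cm/s")
```

On a coarse grid, such as the 11 nodes across the channel used here, the discrete average differs slightly from 2/3, which is one reason to sample the midpoint and scale rather than average the nodes directly.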

Figure 7-1A corresponds to the compressible finite difference method using first-order
differences at the boundary (section 3.3.4); and figures 7-1B and 7-1C correspond
to the lattice Boltzmann method using second-order differences and first-order
differences at the boundary respectively. In figures 7-1A and 7-1B we can see that
imposing a larger-than-necessary pressure drop between the inlet and outlet simply
shifts the pressure field upwards (the curve becomes centered between the imposed
pressure values). Further, we can see that a larger-than-necessary pressure drop
causes a slight increase of the flow through the channel. The change from the imposed
boundary condition to the flow behavior inside the channel includes a ringing
effect which is more noticeable in the case of the compressible finite difference method.

In figure 7-1C, we see that the lattice Boltzmann method using first-order differences
at the boundary predicts a very different pressure drop than the results of
figures 7-1A and 7-1B. In fact, the pressure-gradient slope of figure 7-1C is 3 times
larger than the pressure-gradient slopes of figures 7-1A and 7-1B. It turns out that
the lattice Boltzmann method using first-order differences at the boundary is very
inaccurate with regard to the pressure drop, and overestimates the pressure drop by
a factor of 3 at the present resolution of 10 fluid nodes per width of the channel. By
contrast, the pressure drop of figures 7-1A and 7-1B agrees within 2 decimal digits
with the correct value of Hagen-Poiseuille flow.^

An important fact to mention is that I discovered the inaccurate prediction of
the pressure drop by the lattice Boltzmann method using first-order differences at
the boundary after most of the simulations presented in my thesis had already been
performed.^ Fortunately, the lattice Boltzmann simulations using first-order and
second-order differences at the boundary do not differ greatly with regard to the
operation of the flue pipe; they only differ with regard to the pressure drop inside the
flue channel. This fact was checked for a number of different simulations. Because of
this fact and because of lack of time, the lattice Boltzmann simulations which were
performed using first-order differences have not been repeated using second-order
differences at the boundary. Of course, second-order differences at the boundary are
recommended and should be used in the future.


^An explanation of the large error in pressure drop by the lattice Boltzmann method when using
first-order differences at the boundary must involve the Chapman-Enskog expansion of the extended
collision operator, and is left for future work.

^The overestimated pressure drop has been used as a boundary condition for all the simulations
(lattice Boltzmann method using first-order and second-order differences at the boundary, and
compressible finite difference method).

7.3.1 The end-correction of an open-end pipe

The  rules  mentioned  in  the  previous  section  for  the  reflection  of  acoustic  waves  can  be 
used  to  model  an  open-end  pipe.  Normally,  an  open-end  pipe  requires  the  simulation 
of  a  very  large  region  connected  to  the  outside  of  the  open-end  pipe.  To  save  on 
computational effort, a shortcut can be made by imposing a fixed ambient pressure
at  the  end  of  the  pipe,  and  calculating  the  velocity  via  extrapolation.  This  approach 
reflects  acoustic  waves  in  a  similar  way  that  a  physical  open-end  pipe  does.  Section  7.5 
presents  simulations  of  an  open-end  soprano  recorder  using  this  approach. 

An  issue  with  the  above  approach  is  the  end-correction  of  an  open-end  pipe 
(Rayleigh  [42,  p.287],  Olson  [36,  p.84]).  In  the  physical  world,  the  point  where  the 
pressure  equals  the  ambient  pressure  is  not  exactly  the  end  of  the  pipe,  but  varies 
depending on the diameter of the pipe and possibly on other factors as well. A related
issue is that a specific amount of acoustic energy is radiated outwards during
reflection  from  an  open-end.  This  loss  of  acoustic  energy  may  differ  between  the 
physical  world  and  the  simple  model  of  clamping  the  pressure  and  extrapolating  the 
velocity.  These  are  some  of  the  difficulties  which  make  the  modeling  of  an  open-end 
pipe  more  difficult  than  the  modeling  of  a  closed-end  pipe,  and  should  be  addressed 
in  the  future. 


7.3.2  Smooth  rise  at  startup 


During the initial blowing of the air into the flue channel, the imposed density and
velocity at the inlet rise smoothly to final values within a specified time interval. The
following formula is used both for the velocity and the density,

$$v(t) = v_{\text{final}} \left[ 1 - 10^{-(t/T)^2} \right] \tag{7.3}$$

where $T$ is the rise time it takes to reach 90% of the final value. A rise time of 3 ms
is used in all the simulations presented here, which is relatively fast but not unusual
(Verge94 [57, 56], Hirschberg [26]).
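
A startup ramp of this kind is easy to tabulate. The sketch below assumes the form $v(t) = v_{\text{final}} [1 - 10^{-(t/T)^2}]$; the squared exponent is an assumption made here because it gives a zero initial slope (a smooth start) while still reaching 90% of the final value at $t = T$. The function name and the sample final speed are illustrative.

```python
def startup_ramp(t, final_value, rise_time):
    """Smooth startup ramp: final_value * (1 - 10**(-(t / rise_time)**2)).

    t         : time since startup (same units as rise_time)
    rise_time : time T at which 90% of final_value is reached
    """
    if t <= 0.0:
        return 0.0
    return final_value * (1.0 - 10.0 ** (-((t / rise_time) ** 2)))


RISE_TIME_MS = 3.0       # rise time used in the simulations, in ms
V_FINAL = 1080.0         # cm/s, illustrative imposed inlet speed

for t_ms in (0.0, 1.0, 2.0, 3.0, 4.5, 6.0):
    v = startup_ramp(t_ms, V_FINAL, RISE_TIME_MS)
    print(f"t = {t_ms:4.1f} ms   v = {v:7.1f} cm/s   ({100.0 * v / V_FINAL:5.1f}% of final)")
```

At twice the rise time the ramp is already within 0.01% of its final value, so the inlet conditions are effectively constant after about 6 ms.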
Figure 7-2: The rise of density and velocity inside the flue channel at startup.


Figure 7-2 shows the rise of the density and the velocity (x-component of velocity)
inside the flue channel during the initial blowing of air. These signals are obtained
from a lattice Boltzmann simulation of a closed-end recorder with a mean blowing
speed 1104 cm/s (same as figure 1-16). The signals are sampled at the midpoint
inside the flue channel (maximum flow velocity) and at distance 0.961 cm from the
inlet. The density (shown as $\rho'/\rho_0$) rises at a faster rate than the velocity because the
flow creates additional pressure during startup. The additional pressure is a reaction
of the stagnant air inside the channel to the incoming flow. After a time interval of
20 × 0.206 ms, both the pressure and the velocity approximately reach their final values.
After 40 × 0.206 ms, the onset of periodic acoustic oscillations can be seen as well. The
acoustic oscillations are generated at the flue-labium region, and travel backwards into
the flue channel.

7.4 Closed-end soprano recorder

This  section  presents  further  results  on  the  simulations  of  the  closed-end  soprano 
recorder  described  in  section  1.4. 

It  is  interesting  that  if  we  sample  the  acoustic  signal  (density  variations)  below 
the  labium,  the  fundamental  mode  is  strongly  diminished  compared  to  sampling  the 
radiated signal outside the pipe. Most likely, this is because the open end (flue-labium
region)  acts  as  a  node  for  density  oscillations,  and  an  anti-node  (loop)  for  velocity 
oscillations.  To  be  precise,  the  flue-labium  region  is  actually  driving  the  oscillations, 
and  thus  it  is  somewhat  different  from  an  exact  node  of  a  passive  pipe.  Nevertheless, 
the  flue-labium  region  behaves  very  much  as  an  open  end,  and  as  a  density  node  for  the 
fundamental  frequency.  The  effect  can  be  observed  in  the  computer  simulations  both 
for  a  closed-end  recorder  (open-closed  pipe)  and  for  an  open-end  recorder  (open-open 
pipe)  described  in  section  7.5. 

Figures  7-6  and  7-7  show  the  acoustic  signal  (density  variations)  from  the  lattice 
Boltzmann  simulation  of  a  20  cm  closed-end  recorder  at  blowing  speed  1535  cm/s 
(same plotting conventions as in section 1.4; figure 7-6 is identical to figure 1-17). Two
different sampling locations are examined: the top graphs show the signal outside the
pipe and about 5 cm above the labium, the bottom graphs show the signal inside the
pipe and 1.34 cm below the labium (right on the bottom wall and 0.316 cm forwards
in the horizontal direction from the flue orifice). We can see that the fundamental
mode  of  400  Hz  is  diminished  in  the  bottom  graphs  where  the  signal  is  sampled  below 
the  labium. 

Another interesting observation regarding the signals sampled outside and inside
a pipe can be made in the case of blowing speed 818 cm/s. This is a situation where
the simulated 20 cm closed-end recorder fails to sing, probably because the outlet
region above the labium is small and confined, versus infinite in the physical world, as
explained in section 1.4. The signals sampled outside and inside the pipe for blowing
speed 818 cm/s are shown in figures 7-8 and 7-9 respectively (figure 7-8 is the same
as figure 1-15 except for a longer interval of time). We can see that the density
oscillations outside the pipe diminish quickly after 100 × 0.206 ms. However, there
are periodic density oscillations inside the pipe at the frequency of 820 Hz.

The oscillations at frequency 820 Hz are most likely edge tones (hydrodynamic
oscillations of the jet of air impinging on the labium). It appears that the acoustic coupling
between the jet and the pipe breaks down, and there is no strong amplification
of sound. The frequency of an edge tone is approximately proportional to the blowing
speed. Using the experimental equation 7.1 for edge tones, and putting $W = 0.4$ cm
for the distance between the orifice and the labium, we find $f/V \approx 1$ cm$^{-1}$, which agrees
with $f \approx 820$ Hz and $V = 818$ cm/s.

Another way of examining the oscillations at frequency 820 Hz is shown in figure 7-4,
which plots iso-vorticity contours of the flue-labium region at 38.2 ms after startup
(blowing speed 818 cm/s). We can see that the jet oscillates at blowing speed 818 cm/s
even though little acoustic sound is produced by the recorder. However, the jet
oscillations are relatively small compared to other situations when there is a strong
acoustic signal. To compare, figure 7-5 shows the jet oscillations at blowing velocity
1104 cm/s and 34.7 ms after startup. Now, the jet oscillations are much larger than in
figure 7-4, and the vortices do not align themselves into a stream of vortices above
the labium. The formation of a stream of vortices at blowing speed 818 cm/s is
most likely related to the small blowing speed and the absence of strong acoustic
oscillations.

Figure  7-3  shows  the  jet  oscillations  of  the  20  cm  closed-end  recorder  at  blowing 
speed  818  cm/s  and  11.7  ms  after  startup.  The  acoustic  signal  is  still  strong  at  this 
time, the jet oscillations are large, and the shed vortices are not aligned into a stream
of vortices. The alignment happens later, approximately 20 ms after startup.

Figure 7-3: Simulation of 20 cm closed-end recorder, [...] speed 818 cm/s,
iso-vorticity contours [...]


Figure  7-5:  Simulation  of  20  cm  closed-end  recorder,  34.7  ms  after  startup,  blowing 
speed  1104  cm/s,  iso-vorticity  contours.  The  jet  oscillations  are  large  and  produce 
strong acoustic waves.

Figure 7-6: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity  1535  cm/s,  sampled  5  cm  above  labium. 


Figure  7-7:  Lattice  Boltzmann  method,  20  cm  closed-end  soprano  recorder,  blowing 
velocity 1535 cm/s, sampled 1.34 cm below labium.

Figure 7-8: Lattice Boltzmann method, 20 cm closed-end soprano recorder, blowing
velocity  818  cm/s,  sampled  5  cm  above  labium. 


Figure  7-9:  Lattice  Boltzmann  method,  20  cm  closed-end  soprano  recorder,  blowing 
velocity 818 cm/s, sampled 1.34 cm below labium. An edge tone perhaps occurs here.

7.5 Open-end soprano recorder

An  open-end  version  of  the  soprano  recorder  is  examined  here.  The  geometry  is  the 
same  as  the  one  described  in  section  1.4  except  for  one  difference.  Here,  the  head 
of  the  recorder  is  connected  to  a  pipe  which  is  open  at  the  distant  end.  Also,  the 
total  length  of  the  pipe  (including  the  head  of  the  recorder)  is  chosen  to  be  22  cm 
in  the  present  experiments.  The  frequencies  generated  by  the  open-end  recorder  are 
expected  to  be  in  ratios  of  1  :  2  :  3  :  4  in  contrast  to  the  ratios  1  :  3  :  5  :  7  for  a 
closed-end  recorder.  An  open-end  recorder  behaves  like  an  open-open  pipe  because 
there  is  one  opening  above  the  labium  and  another  opening  at  the  far  end  of  the  pipe. 
The computer simulations of the 22 cm open-end recorder confirm this behavior as
we  shall  see  below. 

The boundary conditions at the open-end pipe are set according to the scheme described
in section 7.3; namely, the density is held constant (ambient pressure), and
the velocity is extrapolated from the nearest neighboring node in the horizontal direction
(normal to the open end). The boundary conditions at the inlet (flue channel)
and the outlet (above the labium) are set in the same way as for a closed-end pipe;
namely, both the density and the incoming/outgoing velocity are imposed.

A complication arises with the balance of incoming flow and outgoing flow because
there is outgoing flow both through the top outlet and through the open-end pipe.
In the present simulations using the lattice Boltzmann method (figures 7-13 and 7-14),
the imposed outgoing flow at the top outlet has been set equal to the imposed
incoming flow at the flue inlet. However, the imposed pressure drop has been set large
enough that the actual incoming flow through the flue channel is significantly larger
than the imposed flow (similar idea as in table 7.1 of section 7.3). This produces
adequate incoming flow to balance both the flow through the top outlet and the flow
through the open-end pipe. Experimentally, it has been measured that the time-average
flow of air through the open-end pipe is about 1/10 of the incoming flow
through the flue channel, and that the remaining 9/10 of the incoming flow exits
through the top outlet. In future simulations, it would be a good idea to set the
imposed inflow at the inlet proportional to 10/10, and the imposed outflow at the
outlet proportional to 9/10, so that 1/10 is left to flow through the open-end pipe.

Figures  7-13  and  7-14  show  the  acoustic  signal  from  simulations  of  the  22  cm 
open-end  recorder  sampled  outside  and  inside  the  pipe.  The  major  frequencies  are 
summarized  in  table  7.2.  For  comparison,  physical  measurements  of  the  acoustic 
signal of a 22 cm open-end recorder are shown in figures 7-15 and 7-16 and table 7.3.
Table  7.4  lists  the  ideal  frequencies  of  a  passive  pipe  which  is  22  cm  long.  The  blowing 
velocity  was  not  measured  in  the  physical  experiments,  but  it  is  estimated  that  the 
velocity  was  on  the  order  of  1000-1500  cm/s  (a  human  subject  blew  the  recorder  in 
these  measurements). 

Figure 7-10 plots iso-vorticity contours of the flue-labium region 38.2 ms after
startup for the 22 cm open-end recorder at blowing speed 1197 cm/s. Comparing this
figure against figure 7-5 of a closed-end recorder, we see that the oscillations of the
jet extend inside the pipe in the case of an open-end recorder. Furthermore, large
vortices are shed inside the pipe (below the labium) as well as outside the pipe. This
behavior can also be seen in figures 7-11 and 7-12 which show a sequence of frames of
the flue-labium region at 29.5 ms after startup. The frames are 0.2169 ms apart. The
top figure shows the velocity vector field, and the bottom figure shows iso-vorticity
contours.^


^In the present simulation of the 22 cm open-end recorder, the imposed influx is 1080 cm/s and
imposed pressure drop is 6.48 x 10® gm/(cm s^2) divided by the mean density of air. The resulting
incoming flow is 1197 cm/s, and the resulting pressure drop is approximately 2.55 x 10® in the same
units  as  above.  The  resulting  pressure  drop  is  measured  by  examining  the  time-average  density  at 
points  near  the  inlet  and  the  top  outlet  (about  1  cm  away  from  the  boundaries),  and  by  measuring 
the  density  gradient  inside  the  flue  channel  to  calculate  the  pressure  drop  along  the  full  length  of 
the  flue  channel. 

®The contours of figure 7-12 are not as nice and smooth as the contours of, say, figure 1-11 because
the present data was saved on disk at 4 times lower resolution than figure 1-11.

Blowing speed   Frequency   Wavelength   Amplitude A2
(cm/s)          (Hz)        (cm)         (10“®)

1197            1321        (26)         0.99
                 667        (52)         5.75
                 780        (44)         1.93
                1512        (23)         1.84

Table  7.2:  Frequencies,  lattice  Boltzmann,  22  cm  open-end  recorder 


Frequency   Wavelength   Amplitude
(Hz)        (cm)

 691        (50)         6.39  x 10^-1
1381        (25)         2.24  x 10^-2
2071        (17)         0.97  x 10^-®
2761        (12)         0.468 x 10^-®

Table  7.3:  Frequencies,  physical  measurements,  22  cm  open-end  recorder 


22 cm pipe     f0    (λ0)    f1    (λ1)    f2    (λ2)     f3    (λ3)    f4    (λ4)
               Hz    (cm)    Hz    (cm)    Hz    (cm)     Hz    (cm)    Hz    (cm)

open-closed    391   (88)    1173  (29)    1955  (18)     2736  (13)    3518  (10)
open-open      782   (44)    1564  (22)    2345  (14.7)   3127  (11)    3909  (8.8)

Table  7.4:  Ideal  resonant  frequencies,  22  cm,  open-closed  and  open-open. 
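
The ideal frequencies of table 7.4 follow from the standing-wave conditions of a passive pipe of length L: an open-open pipe supports wavelengths 2L/(n+1), and an open-closed pipe supports wavelengths 4L/(2n+1), with frequency equal to the speed of sound divided by the wavelength. The following sketch recomputes the table under two assumptions that are not stated in this section: end corrections are ignored, and the speed of sound is taken as 34400 cm/s, a value chosen here only because it reproduces the tabulated numbers. The function names are illustrative.

```python
# Ideal resonances of a passive pipe of length L (end corrections ignored).
SOUND_SPEED_CM_S = 34400.0  # assumed value; chosen because it reproduces table 7.4
PIPE_LENGTH_CM = 22.0


def open_open(n, length=PIPE_LENGTH_CM, c=SOUND_SPEED_CM_S):
    """Mode n = 0, 1, 2, ...: wavelength 2L/(n+1); returns (frequency, wavelength)."""
    wavelength = 2.0 * length / (n + 1)
    return c / wavelength, wavelength


def open_closed(n, length=PIPE_LENGTH_CM, c=SOUND_SPEED_CM_S):
    """Mode n = 0, 1, 2, ...: wavelength 4L/(2n+1), i.e. odd harmonics only."""
    wavelength = 4.0 * length / (2 * n + 1)
    return c / wavelength, wavelength


if __name__ == "__main__":
    for name, mode in (("open-closed", open_closed), ("open-open", open_open)):
        cells = []
        for n in range(5):
            frequency, wavelength = mode(n)
            cells.append("%d Hz (%.1f cm)" % (round(frequency), wavelength))
        print(name.ljust(12), "  ".join(cells))
```

With these assumptions, the simulated peak at 780 Hz in table 7.2 lies next to the ideal open-open fundamental of 782 Hz, whereas the measured fundamental of 691 Hz in table 7.3 lies below it.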


Figure 7-10: Lattice Boltzmann simulation of 22 cm open-end recorder, 31.2 ms after
startup, blowing speed 1197 cm/s, iso-vorticity contours.

Figure 7-11: Frames left to right 0.2169 ms apart, velocity vector field, 22 cm open-end
recorder, 29.5 ms after startup, blowing speed 1197 cm/s.


Figure 7-12: Frames left to right 0.2169 ms apart, iso-vorticity contours, 22 cm open-end
recorder, 29.5 ms after startup, blowing speed 1197 cm/s.

Figure 7-13: Lattice Boltzmann method, 22 cm open-end soprano recorder, blowing
velocity 1197 cm/s, sampled 5 cm above the labium.


Figure 7-14: Lattice Boltzmann method, 22 cm open-end soprano recorder, blowing
velocity 1197 cm/s, sampled 1.34 cm below the labium.

Figure 7-15: Physical measurements, 22 cm open-end recorder, steady state. Arbitrary
units of amplitude.


Figure  7-16:  Physical  measurements,  22  cm  open-end  recorder,  startup  transient. 


Chapter  8 
Conclusion 

8.1  What  has  been  accomplished 

After  all  the  work  is  done,  comes  the  point  when  we  ask, 

Are  we  any  better  off  than  when  we  started  ? 

I think that the answer is “YES” in a number of ways. First, the big picture is that a
previously unexplored area of fluid dynamics has succumbed to computer simulation.
Using parallel computing on a cluster of non-dedicated workstations, the first simulations
of hydrodynamics and acoustic waves inside wind musical instruments have
been performed. Further, the simulations are in reasonable agreement with physical
measurements of the acoustic signal of various flue pipes. Prior to my thesis, there
were doubts whether the simulation of flue pipes using the compressible Navier-Stokes
equations was feasible. Some of the difficulties which seemed insurmountable are the
following.

•  Whether  enough  compute  cycles  can  be  found  (very  small  integration  time  steps 
must  be  used). 

• Whether two dozen non-dedicated workstations in my research group can be
harnessed to perform intensive parallel computing for days and weeks without
disturbing the regular users.

• Whether the numerical stability problems (slow-growing high-frequency oscillations)
which arise in simulations of subsonic compressible flow can be handled.

•  Whether  the  lattice  Boltzmann  method  (one  of  the  numerical  methods  I  use) 
can  work  at  all. 

•  Whether  uniform  grids  can  be  successful  in  simulating  the  sharp  edge  (labium) 
of  a  flue  pipe. 

My thesis has not found the best solutions to these problems, but has found some
good-enough solutions, and this is the first step.

The approach presented here can be easily applied to other problems. In particular,
the numerical techniques of my thesis are generally applicable to any problem
of compressible subsonic flow. Also, the programming techniques and the organization
of my parallel simulation system on a cluster of non-dedicated workstations can
be applied to any problem that involves local interactions and a static decomposition
(vision problems, for example). My parallel system is very simple and effective
because the constraints of local and static problems have been fully exploited.

One of the messages of my thesis is that a cross-disciplinary approach is needed for
solving problems in scientific computing. The mathematics, the numerical modeling,
the parallelization, the low-level system implementation, the sharing of the workstations,
the different software abstractions and the representations of the problem, and
many other issues have all been considered more or less together in order to find good
effective solutions. In other words, my thesis promotes a generalist’s approach.

Another message of my thesis is that explicit methods are very promising for parallel
computing. In the present simulations, there is a match between the requirements
of the problem (small time steps for subsonic compressible flow), the requirements of
explicit methods, and the requirements of the computer system (small communication
capacity on a cluster of workstations). More generally, explicit methods are
desirable for parallel computing whenever increasing numbers of local processing units
are available with small communication capacity between the processing units. Perhaps
future parallel computers will consist of millions of local processing units, each
unit having the power of one of today’s workstations. Communication is going to
dominate the cost of such computers, and methods that minimize communication are
going to be desirable. A vision of such immense computers has guided many of the
approaches of my thesis.
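
The communication pattern behind this argument can be illustrated with a minimal sketch. It is not the simulation system of this thesis: it assumes a toy one-dimensional periodic diffusion stencil in place of the flow solvers, and all function names are hypothetical. The point is only that, with an explicit update and a static decomposition, each subdomain needs just the edge values of its two neighbours per time step, and the decomposed run reproduces the single-processor run exactly.

```python
def step(u_padded, k):
    """One explicit update of the interior cells.

    u_padded carries one ghost cell on each side; the result has two fewer cells.
    """
    return [u_padded[i] + k * (u_padded[i - 1] - 2.0 * u_padded[i] + u_padded[i + 1])
            for i in range(1, len(u_padded) - 1)]


def run_single(u, k, steps):
    """Reference run: the whole periodic grid handled as one piece."""
    for _ in range(steps):
        u = step([u[-1]] + u + [u[0]], k)
    return u


def run_decomposed(u, k, steps, parts):
    """Same computation on `parts` static subdomains.

    Assumes len(u) is divisible by parts. Per step, each subdomain receives
    exactly two numbers: the edge values of its left and right neighbours.
    """
    size = len(u) // parts
    subs = [u[p * size:(p + 1) * size] for p in range(parts)]
    for _ in range(steps):
        # Communication phase: one ghost cell from each neighbour (periodic wrap).
        padded = [[subs[p - 1][-1]] + subs[p] + [subs[(p + 1) % parts][0]]
                  for p in range(parts)]
        # Computation phase: purely local, so the subdomains could run concurrently.
        subs = [step(chunk, k) for chunk in padded]
    return [x for sub in subs for x in sub]


if __name__ == "__main__":
    u0 = [float((i * 7) % 5) for i in range(24)]
    same = run_decomposed(u0, 0.25, 10, 4) == run_single(u0, 0.25, 10)
    print("decomposed run matches single run:", same)
```

In this sketch the data exchanged per step is independent of the subdomain size, while the local work grows with it, which is the sense in which explicit local-interaction methods keep communication small.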

Apart from the big picture, my thesis also has numerous detailed results to offer.
One result is the demonstration and the analysis of artificial-viscosity filters for
mitigating the high-frequency instabilities of subsonic compressible flow. Another
numerical result is my work on the boundary conditions and the accuracy of the lattice
Boltzmann method. With regard to distributed computing, the simple structure
of my program, and the automatic process migration are worth remembering. With
regard to the physics of musical instruments, the detailed pictures of the jet of air
oscillating inside a flue pipe are unique and very important for studying this complex
phenomenon.
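
As a reminder of why a fourth-order filter targets the high-frequency modes, the sketch below applies the textbook fourth-difference filter to a periodic one-dimensional signal. This is an assumed illustrative form, not necessarily the exact filter or coefficient used in the simulations; the names `fourth_order_filter` and `eps` are hypothetical. For a mode with phase advance theta per grid cell, this filter multiplies the amplitude by 1 - 16 eps sin^4(theta/2), so the grid-scale sawtooth (theta = pi) is damped by the factor 1 - 16 eps while long waves are left nearly untouched.

```python
import math


def fourth_order_filter(u, eps):
    """Periodic 1-D filter: u_i minus eps times the fourth difference of u at i."""
    n = len(u)
    return [u[i] - eps * (u[i - 2] - 4.0 * u[i - 1] + 6.0 * u[i]
                          - 4.0 * u[(i + 1) % n] + u[(i + 2) % n])
            for i in range(n)]


def amplitude(u):
    """Largest absolute value in the signal."""
    return max(abs(x) for x in u)


if __name__ == "__main__":
    n, eps = 64, 0.02
    smooth = [math.sin(2.0 * math.pi * i / n) for i in range(n)]  # one long wave
    sawtooth = [(-1.0) ** i for i in range(n)]                    # grid-scale mode
    print("smooth amplitude after filter  :", amplitude(fourth_order_filter(smooth, eps)))
    print("sawtooth amplitude after filter:", amplitude(fourth_order_filter(sawtooth, eps)))
```

Under these assumptions, one pass with eps = 0.02 reduces the sawtooth amplitude from 1 to 0.68, while the long wave loses only a few parts per million.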

Directions  for  future  work  are  summarized  below. 


8.2  Ideas  for  future  work 

8.2.1 Physical applications

• Someday soon, it may be possible for the computer to find reduced models of
flue pipes automatically (see section 7.2 for an introduction to reduced models of
flue pipes) by performing a few preliminary direct simulations of flue pipes, and
then examining the results. The present simulation system could be combined
with another “intelligent” program which knows about a number of possible
reduced models, and tries to fit the best model to the data.

• The present simulations can be easily extended to include flue pipes with finger
holes, and also pipes which are simple models for the human vocal tract (see
Shadle [46] for a simple pipe that models the vocal tract).

• The present approach of simulating compressible subsonic fluid dynamics has
applications  in  the  design  of  oil  and  gas  carrying  pipes  [37],  and  perhaps  in 
the  study  of  medical  issues  such  as  the  acoustic  waves  inside  blood  arteries  and 
non-intrusive  measurements  of  arteriosclerosis  [24],  etc. 

• New applications may arise in the future. For example, undulating jets of
burning fuel may be able to increase the efficiency of a combustion engine. This
is a very distant idea at present, but it deserves some attention. Computer
simulations such as the ones presented here will be very important in such future
studies. The present simulations must be extended to model heat conduction
and two-phase flow.

8.2.2  Parallel  computing 

• We have seen that explicit methods are highly suitable for parallel computing,
but require very small time steps for stability. Between implicit methods
(full matrix equation) and explicit methods (local interactions) there may exist
intermediate methods; for example, methods that use small matrices that do
not extend the full length of the numerical grid. Such methods might lead to
improved numerical stability while preserving the benefits of local-interaction
algorithms (see section 3.2).

• Uniform grids such as the ones employed here are very simple and work well,
but they are not very efficient. Non-uniform grids are needed in order to focus
the computational power on regions where it is most needed, such as sharp
obstacles. Unstructured non-uniform grids are very promising, and a lot of
research is currently being done on them [6]. An interesting project is to try to
develop unstructured grids on a cluster of non-dedicated workstations.

8.2.3  Numerical  analysis 

•  Section  5.5  raises  some  interesting  questions  regarding  the  relationship  between 
artificial-viscosity filters, physical turbulence, and perhaps a kind of “discrete
turbulence”  which  is  a  property  of  systems  of  difference  equations  as  opposed 
to  differential  equations. 

• There is a need to develop numerical conditions that approximate an infinite
region at the outlet boundary (see section 7.3), and also suitable techniques that
remove the generated vorticity from the simulated region in order to continue
the simulations of flue pipes for indefinitely long periods of time (see sections 1.4
and 7.3).

•  A  comprehensive  theoretical  analysis  of  the  stability  and  the  accuracy  of  the 
lattice Boltzmann method is incomplete at the present time (see section 4.1.3).